Preface to the first Edition.


Mathematical statistics is the tool whose help enables the
statistician to draw conclusions from his statistical material. As
with any other tool, the result and the value of its application
depend primarily upon how the director of the work understands the
execution of his assignment. This tool can just as easily be misused
or perverted as properly used--perhaps more easily--especially
since the mathematical apparatus has recently received a sharpness
which did not even approximately exist formerly. Mathematical
statistics is no automaton, wherein one need only insert the
statistical data and then, after some mechanical operations, read
off the result as from a calculating machine. One is not always
certain of obtaining by such mechanical operations the correct
answer to the problem.

With this reservation, then, it must be said that mathematical
statistics is just as necessary for the statistician as the knife
for the surgeon. The statement proper of the question--that is,
the formulation of the problem to be solved, and likewise the
collecting of the statistical data relevant to and shedding light on
the question--involves two tasks that demand special technical
knowledge in that scientific branch to which the problem belongs.
Once, however, the data have been collected and the relevant
question formulated, to answer that question is a task that lies
wholly in the realm of mathematical statistics.

Mathematical statistics dates, as a science, from the beginning
of the eighteenth century. The famed theorem of Bernoulli remains
today the foundation of the structure of statistics. On this
is founded the fundamental principle of all statistics: to derive,
from statistically large numbers, the laws of complex events
which underlie the fluctuating numerical statistical series. The
‘divine order’ of Süssmilch is perhaps to be taken as the most
complete expression of the viewpoint of applied statistics during
the time immediately after the publication of the fundamental
theorem of Bernoulli.

The development which began with Bernoulli received, principally
through the investigations of De Moivre, Gauss, and Laplace,
a certain conclusion. The Théorie analytique des probabilités of
Laplace [14] is doubtless the most significant work that has
appeared in the realm of mathematical statistics. Unfortunately,
excepting Poisson's elegant development of some theorems of
Laplace, the many points of attack found in the work of Laplace
remained as good as unobserved. Only in very recent years have we
opened our eyes to the great collection of undeveloped basic
theorems which are entombed in the great work of Laplace.

Responsibility for the stagnation which set in at that time in
the development of mathematical statistics rests principally upon
Gauss. This great mathematician believed himself able to prove
that fluctuations in the elements of a statistical series--he
concerned himself chiefly with series of astronomical and geodetic
observations--follow strictly the simple law called after him the
Gaussian law of error. Where deviations appeared, he believed that
he could attribute them solely to the small number of observations.
He stated--on the basis of an erroneous mathematical proof--that
the deviations would vanish if only the number of observations
were sufficiently large. This theorem pervaded all mathematical
statistics of the nineteenth century like an article of
faith, and the method of least squares, based on the Gaussian law
of error, was, and often still is, considered a definitive solution
of the problem of strict scientific treatment of statistical
series of observations.

In application of statistics other than to astronomy, Quetelet
appears to be the one who applied the principles of the Gaussian
law of error with greatest success. His doctrine of type is certainly
a most important basic theorem of statistics; but one must carefully
guard against the presumption that the mathematical type of
the statistical object necessarily corresponds to an actual
(physical, biological) type. In this last respect Quetelet was guilty
of great excesses and thereby, in many respects against his will,
brought a discredit upon mathematical statistics from which it has
not yet altogether recovered.

Three years after the death of Quetelet appeared the paper of
Lexis, Zur Theorie der Massenerscheinungen in der menschlichen
Gesellschaft [15], which contains the first essential progress in
mathematical statistics since the days of Laplace. Lexis shews
here that statistical events do not altogether follow, as it was
customary to assume, the Bernoullian laws of probability. He gives
(more completely in other papers) an explanation by mathematical
statistics of these deviations and is led thereby to a simple
criterion which serves to estimate the intensity of the foreign
disturbances to which a statistical event is exposed.

Nevertheless the Gaussian law of error remained unshaken.
Then, however, at the end of the nineteenth century, came the
breakthrough. Just as has so often been the case, as history
shews, with the progress of new scientific truths, here also came
the discovery simultaneously or nearly simultaneously from many
sides. It is particularly interesting to observe, as signifying
the universality of statistical science, how representatives of
the furthest parts of science reached the same goal by more or
less different paths. We find among the pioneers the astronomers
Thiele and Bruns, the psychologists Fechner and Lipps, the
biologists Galton and Pearson, the political economists Edgeworth
and others. Today there is a ferment of new thoughts and discoveries
in this field of labour, so that one may well say that
mathematical statistics has unfolded in the last decades into a
new science.

Among the investigators who together have worked here must be
particularly noticed the English mathematician and biologist Karl
Pearson, professor of applied mathematics (mechanics) at the
University of London. In a rather large series of papers,
principally in the Philosophical Transactions under the general title
‘Mathematical contributions to the theory of evolution’ [18], this
outstanding scholar has attacked and solved a class of difficult
problems in mathematical statistics; he has moreover succeeded in
assembling about him a numerous school of men and women scholars
of the most different branches of the science, who work with
unflagging zeal at the further development of the ideas of Pearson.

Although, during my years of statistical study, I was moved to
acquaint myself as carefully as possible with the results of
Pearson and his disciples, the exposition of mathematical
statistics given here is not directly based on the investigations of
Pearson. Without wishing to undertake a detailed critique of his
investigations, which, moreover, I most highly admire, I
nevertheless believe it necessary to remark that the methods of Pearson
possess an essential error, which consists in lacking sufficient
generality both in the choice of the starting point and in the
practical application. His 7 types of error law¹ are unquestionably
admirable formulae of interpolation; but they are derived
without reference to the genetic development of such laws of
error. Also there is no place in them for the computation of the
higher characteristics of statistical series. His theory of
correlation has a similar flaw.

1. <The Pearson system has been extended to 13 types. --Tr.>

My treatment of mathematical statistics is based on two small
notices in the Meddelanden från Lunds Observatorium [2; 3]. The
immediate starting point for deriving the general form of
statistical series can be found in Laplace’s Théorie analytique, and I
found later [5] that the same basic principles can be extended to
the general form of correlation functions. Thus the entire field of
mathematical statistics, as far as it is now worked out, receives
a unified treatment. Not only is clearness in the mathematical
treatment of the problem obtained in this manner, but also it is
seen that the resulting formulae for practical numerical
computation take an exceedingly simple form. I have convinced myself in
several practical cases that these formulae can be applied and
used by persons who have had no opportunity for a special
mathematical education.

In the present redaction, which I undertook at the request of
Professor Fahlbeck for the Statsvetenskaplig Tidskrift, most
mathematical derivations are omitted. I will however display the
signification of the formulae by means of numerical examples from
various fields of applied statistics. My treatment here will
principally be only a résumé of practical prescriptions for the
numerical treatment of statistical series.

Before beginning my account, I feel that a few words should be
said about the division of the subject of statistics that I have
found necessary for several reasons to introduce--a division which
perhaps may appear strange to professional statisticians. I divide
statistics into two essentially different parts:

I. Homograde or alternative statistics,

II. Heterograde or qualitative statistics.


These two parts differ in the nature of the original lists, in
the manner of portraying the statistical series, and, last and most
important, in the problems that the statistician has to solve. In
homograde statistics the theorems of Bernoulli and of Lexis
furnish the ruling concepts. Their use yields conclusions as to the
extent of the foreign disturbances which work upon the statistical
object, and the task of the statistician is to investigate and
distinguish these foreign influences. In heterograde statistics
the mathematical theory gives no information about the types
involved, which must be determined solely from the empirical data.

The connexion between different statistical events is given, 
in both parts of statistics, by the theory of correlation, which, 
however, takes on a somewhat different form within each of the two 
principal parts of statistics. 
Preface to the second Edition.


This edition is principally an unchanged translation of the
first (Swedish) edition. Chapter XV: ‘Abbreviated methods for the
computation of the characteristics’ and the tables of the
functions Qp R, (J\), (P 3 , and d)^ have been added.

My hearty thanks are due to Messrs Messow and Baade, of the
Hamburg observatory, who have kindly helped me with the
proofreading, and to the printer, Lütcke & Wulff, Hamburg, which has
most graciously undertaken to print the German edition.

Lund Observatory, 1 January 1920.


C. V. L. Charlier.


Translator’s Preface.

The text of the present work, which was undertaken at the
suggestion of Professor W. L. Crum of Harvard, follows closely the
edition of 1920. Footnotes which the translator has added are
enclosed in brackets and indicated by the mark --Tr.

The numerical examples have been taken unchanged from the
edition of 1920 with one important exception. Both the translator
and Professor Crum deem the introduction of the factor 5 into
equation (2) of Chapter X, which Charlier explains in the words
‘The y-coordinates are multiplied by an arbitrarily chosen factor
(5), in order to obtain the same scale for both coordinates when
drawing the curve (on ordinary squared paper)’, unnecessary and
likely to confuse the beginning student. This factor has therefore
been suppressed and Tables 23, 24, 25, 26, 27, Figures 1, 2, 3, 4, 5,
and the diagrams of Art. 62 and of the appendix altered accordingly.

The Edgeworth term in $\varphi_6$ and its influence on the excess
have been added in footnotes.

Table 41 is composed as follows: [...] and $\varphi_0$ are taken
from Sheppard [19, II: 2-?]. $\varphi_2$, $\varphi_3$, and
$\varphi_5$ were computed by the translator, partly to thirteen
places of decimals, partly to ten places, using as basis Lowan’s
table of $\varphi_0$ [16] and Jørgensen’s tables of the second,
third, and fifth Hermite polynomials [12, VI: 196-199, 202-203],
and then rounded to seven places. $\varphi_4$ is from Charlier’s
Table V. $\varphi_6$ is cut down to four places from
Jørgensen [12, V: 178-193].

Table 42 has been checked against Kondo and Elderton [20, II:
2-10]. Eight last-place errors have been corrected.

Table 43 reproduces the tables of Bortkiewicz [1, 49-52]. As
Soper [19, p. lxxvi] points out, the fourth digit of Bortkiewicz
is not always correct; errors found by comparison with Soper's
table [19, LI: 113-117] have been corrected.

Table 44 was newly computed by the translator.

The translator wishes to acknowledge the assistance of Messrs 
L. R. Brooks, P. V. Bugstrom, H. V. du Bouchet, Mac Kinnon A. 
Greeley, Caroline Neef, G. K. Takayama, and H. A. Thomas jr. 

Cambridge, Massachusetts, 19 January 1947.


Part One.


Homograde or alternative Statistics. 


Cap. I. Introduction. Definition of homograde and of heterograde
Statistics.

The primary data from which statistical numbers are derived
comprise a list of individuals for which a certain attribute has
been observed and possibly measured. The individuals can be men,
animals, plants, or even inanimate or abstract things and
phaenomena; the attribute may be a length, a weight, or any other
property of the individuals.

The collection of all individuals which are observed in a
certain statistical investigation forms a population.

The original collection of observations has been called the
original list.

The attribute being studied may in general assume a continuous
or discontinuous range of degrees of intensity. If this intensity
has been measured, it can be expressed by a multiple or other
function of a suitable unit. The intensity measured in this manner
is called the degree of the attribute.

The simplest method of constructing an original list is simply
to note whether a certain attribute--or possibly a certain degree
of this attribute--is present or absent among the individuals
which belong to the population. The original list has then the
following appearance:


Table 1.

Original list of homograde individuals.

  Designation          The attribute
  of the individual    present    absent

        I_1               1
        I_2                          1
        I_3                          1
        I_4               1
        I_5               1


It is here assumed that the individuals $I_1$, $I_4$, and $I_5$
possess the attribute in question, but that the individuals $I_2$
and $I_3$ do not.

In publishing observations of this kind it is seldom necessary
or even useful to give the original list in this original form.
The whole population is then divided into groups consisting of,
say, $s_1$, $s_2$, ..., $s_N$ individuals, so that

$$ s_1 + s_2 + s_3 + \dots + s_N = P, \tag{1} $$

where P denotes the total number of individuals in the population
and N the number of groups.

If in particular $s_1$, $s_2$, $s_3$, ..., $s_N$ contain the same
number (= s) of individuals--which is often approximately the
case--equation (1) assumes the form

$$ Ns = P. $$

Now let us designate the number of individuals in the different
groups that possess the attribute in question by $m_1$, $m_2$,
..., $m_N$. We obtain as the result of observations the series

$$ m_1, m_2, m_3, \dots, m_N, $$

which is called a statistical series of homograde quantities.

The numbers $m_1$, $m_2$, $m_3$, ..., $m_N$ are called the elements
of the statistical series; $s_1$, $s_2$, ..., $s_N$ are called the
numbers of comparison of the elements.
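
The passage from an original list to a statistical series of homograde quantities can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `homograde_series` is hypothetical, the groups are taken to be of equal size s, and the ten-entry list is invented (only its first five entries follow Table 1).

```python
from typing import List


def homograde_series(original_list: List[bool], group_size: int) -> List[int]:
    """Return the elements m_1, ..., m_N: the count of individuals
    possessing the attribute in each consecutive group of s individuals."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    if len(original_list) % group_size != 0:
        raise ValueError("population size P must equal N * s")
    return [
        sum(original_list[start:start + group_size])  # True counts as 1
        for start in range(0, len(original_list), group_size)
    ]


# Hypothetical original list of P = 10 individuals (True = attribute present).
# The first five entries follow Table 1; the last five are invented.
original = [True, False, False, True, True, False, True, True, False, False]
series = homograde_series(original, group_size=5)
print(series)  # [3, 2]
```

Here N = 2 and s = 5, so that Ns = P = 10; the numbers of comparison are both equal to 5.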

When the form of original list is prepared as in Table 1, we
have to decide whether the attribute in question is present or
absent in each individual. If the attribute in question can occur in
only one degree, then all individuals are identical in this
respect. This is the case with statistics of the number of
men or women in a country, of the number of births, deaths,
immigrants, emigrants, school children, suicides, etc. A person
is dead or not, newborn or not, suicide or not, and a graduation
of these properties is not possible, except under highly
artificial assumptions.

It is not however necessary in preparing Table 1 that all the
individuals listed in the second column possess the attribute in
question in the same degree. Observe, e.g., a population of grown
men and arrange them according to their cephalic index (100 times
breadth of head divided by length). If all men with an index
smaller than, say, 80, are called dolichocephalic, then we can
construct an original list like Table 1 by designating the
attribute in question as ‘dolichocephaly’. Obviously all individuals
who are designated as dolichocephalic are not identical with
respect to the attribute in question, but in the preparation of the
original table the gradation of the attribute can be neglected.



If the individuals all possess an attribute in the same degree
or if no attention is paid in the statistical investigation to the
different intensity of the attribute in different individuals, it
is said that the individuals are homograde. The part of statistics
concerned with such individuals is called homograde statistics.

With reference to the manner of preparation of the original
list it can also be called alternative statistics.

Contrariwise, individuals which possess a certain attribute in
different degrees are called heterograde, and that part of
statistics concerned with such individuals (quantities) is called
heterograde statistics. Instead of this expression the name
qualitative statistics may be used.




The customary form of a primary list of heterograde individuals
is the form of the following table.


Table 2.

Original list of heterograde individuals.

  Designation          Degree of
  of the individual    the attribute

        I_1               x_1
        I_2               x_2
        I_3               x_3
        I_4               x_4
        I_5               x_5


Here the numbers $x_1$, $x_2$, $x_3$, $x_4$, $x_5$, etc., shew the
intensity (the degree) of the attribute in question in each
individual, expressed in a suitable unit (meter, liter, kilogram,
year, etc.). The numbers

$$ x_1, x_2, x_3, \dots $$

now directly form the statistical series, and each $x_k$ is an
element of the series.

As examples of statistical attributes which belong to
heterograde statistics may be named the duration of life or of an
illness, the length, volume, or weight of animals, plants, or
inanimate objects, the length of the period of gestation, the colour
of the hair or of the eyes, the spectrum of the stars, etc.

Cap. II. The arithmetic Mean.

The first task of mathematical statistics is to show how the
characteristic properties of a statistical series, or, more
generally, of a number of simultaneously considered statistical series
can be determined. Experience has taught that in the apparent
disorder that a statistical series displays simple laws rule and
permit each statistical series or group of simultaneously
considered interdependent statistical series to be simply and
uniquely characterised.

The numbers that express the essential properties of a
statistical series are called in this work the characteristics of
the statistical series.

In most cases four of these characteristics are sufficient.
For rather small series (= series with a small number N of
elements) two of them suffice. In exceptional cases one can be
satisfied with a single characteristic.

The first four characteristics have received the following
names:

1. the medium¹ or arithmetic mean (M);

2. the dispersion¹ (σ);

3. the skewness or asymmetry (S);

4. the excess (E).

I shall assume, here and in several chapters to follow, that
those individuals which belong to the original list in Table 1 are
divided into N groups, each with the same number of comparison
(s). The general case, where this number varies from group to
group, will be treated in Cap. VII.

Now let

$$ m_1, m_2, m_3, \dots, m_N \tag{1} $$

be the given statistical series whose characteristics are to be
determined. The arithmetic mean is defined by the formula

$$ M = \frac{m_1 + m_2 + m_3 + \dots + m_N}{N}. \tag{2} $$

It may seem superfluous to spend many words on so well-known a
concept as the arithmetic mean, at least on its numerical
computation. However I must exactly in this connexion turn the attention
to several points.

The formula (2) can obviously be written in the form

$$ M = \frac{(m_1 - M_0) + (m_2 - M_0) + \dots + (m_N - M_0)}{N} + M_0 = b + M_0, \tag{3} $$

where $M_0$ designates any arbitrary number and $b = M - M_0$.²
This formula can be used to simplify the computation of the mean.
The number $M_0$, which it is useful to take not too far from M, is
called the provisional mean. Although it plays no indispensable
role in the computation of the arithmetic mean, it is of great
practical significance in deriving the higher characteristics. It
is therefore desirable to become early acquainted with its use.

In Table 3 I have given a practical example of the provisional
mean. Computing the mean of the numbers in the second column, we
have M = 6196/24 = 258.2.

1. In general I shall use such names as have nearly the same form in
different languages, and most preferably names borrowed from the Latin
language. <I have not kept the name medium from fear that English
readers will confuse it with the statistical concept median. It should
also be noted that the name dispersion is frequently given to
$\sigma^2$. --Tr.>

2. Here, as in the sequel, I designate the quantity $M - M_0$ by b.
Table 3.

Number of boys per 500 births
in different provinces of Sweden in May 1883.

s = 500, N = 24, M_0 = 250.

    k      m_k      m_k - M_0      (m_k - M_0)^2
                     +      -

    1      244              6             36
    2      243              7             49
    3      231             19            361
    4      275      25                   625
    5      264      14                   196
    6      256       6                    36
    7      257       7                    49
    8      250       0                     0
    9      240             10            100
   10      266      16                   256
   11      271      21                   441
   12      259       9                    81
   13      256       6                    36
   14      246              4             16
   15      263      13                   169
   16      246              4             16
   17      267      17                   289
   18      280      30                   900
   19      259       9                    81
   20      244              6             36
   21      252       2                     4
   22      282      32                  1024
   23      261      11                   121
   24      284      34                  1156

  Sum     6196    +252    -56           6078

Employing the provisional mean $M_0 = 250$, we have

$$ b = (252 - 56) : 24 = 8.2 $$

and therefore

$$ M = 8.2 + 250 = 258.2 $$

as before.

The last column of the table is used in computing the
dispersion (compare the next chapter).
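
The two routes to the mean, by formula (2) and by formula (3), can be compared in a short Python sketch using the 24 elements of Table 3. The function names `mean_direct` and `mean_provisional` are illustrative assumptions, not notation from the text.

```python
# Elements m_k of Table 3 (boys per 500 births, N = 24).
m = [244, 243, 231, 275, 264, 256, 257, 250, 240, 266, 271, 259,
     256, 246, 263, 246, 267, 280, 259, 244, 252, 282, 261, 284]


def mean_direct(values):
    """Formula (2): sum of the elements divided by their number."""
    return sum(values) / len(values)


def mean_provisional(values, m0):
    """Formula (3): average the deviations from the provisional mean m0,
    then add m0 back. Returns the pair (M, b)."""
    b = sum(v - m0 for v in values) / len(values)
    return b + m0, b


M_direct = mean_direct(m)
M_prov, b = mean_provisional(m, 250)

print(sum(m), round(M_direct, 1))    # 6196 258.2
print(round(b, 1), round(M_prov, 1)) # 8.2 258.2
```

Both routes give the same M; the provisional mean only keeps the intermediate numbers small, which matters chiefly in hand computation.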

With respect to the application of the provisional mean we
remark that the value of M which we get is the same, whether or
not the provisional mean is applied. The computation of all other
characteristics, however, is not only more easy but more exact
when $M_0$ is used. For in ordinary calculations M must be rounded to
a small number of decimals, preventing application of the exact
value of M. This cannot happen with $M_0$, and thus the entire
computation becomes simpler and the result more exact with less
work.

Division into classes. 

If the number of elements in the statistical series is very
large, direct computation of the mean by formula (2) or (3) is
very laborious. It is then useful to arrange the elements in
classes, so that the elements of approximately the same size are
taken together in the same class. For this the class interval (w)
and the class limits must be properly chosen; their meaning is
clear from the following numerical example.

The number of newborn boys in each month for each province
(i.e. ‘län’) of Sweden (with the exception of Gotland) was taken
from the official statistics of Sweden for the years 1883 and 1890,
and uniformly reduced to the basis of boys per 500 births per
province per month. The 576 elements obtained in this way, which
varied between the lowest value 202 (Jämtland, February 1890) and
the highest value 300 (Wärmland, September 1890), were arranged in
classes with a class interval w = 5.

The class limits were so chosen that all elements between 200
and 204 were put into one class, all elements between 205 and 209
into one class, etc. Thus was obtained Table 4.

The first column gives the class limits, the second the
corresponding centre of each class. These columns may generally be
omitted, since they are sufficiently described by the values $M_0$
and w above the table. The fourth column gives the frequency, i.e. the
number of elements of the statistical series within each class. We
find that the number of boys in 500 newborn lies only once between
the limits 200 and 204, but 18 times between 235 and 239, 108
times between 255 and 259, etc. The frequency is generally
designated F(x), so that F(x) designates the frequency in a class with
the mark x.

The remaining columns of Table 4 are now easily understood. To
compute the mean a provisional mean is chosen. We here take the
class 255-259, in which falls the largest number of elements. $M_0$,
the centre of this class, is 257. The other classes now become -5,
-10, -15, etc., in the negative and +5, +10, +15, etc., in the
positive direction. It is best to take the class interval as a unit;
the classes are then designated by the numbers -1, -2, -3, etc.,
and +1, +2, +3, etc. These numbers are called class-marks in the
table and designated x.

3. The class interval should be chosen as a convenient number, such 
that the dispersion is not less than 4w, nor the range than 



Table 4.

Number of boys per 500 births in Sweden for each month
of the years 1883 and 1890.

s = 500, N = 576, M_0 = 257, w = 5.

            Class                Frequency        x F(x)
   Limits    Centre   Mark x       F(x)        pos.     neg.

  200-204     202      -11           1                  - 11
  205-209     207      -10           0                     0
  210-214     212      - 9           0                     0
  215-219     217      - 8           1                  -  8
  220-224     222      - 7           2                  - 14
  225-229     227      - 6           5                  - 30
  230-234     232      - 5          13                  - 65
  235-239     237      - 4          18                  - 72
  240-244     242      - 3          47                  -141
  245-249     247      - 2          60                  -120
  250-254     252      - 1          81                  - 81
  255-259     257        0         108                     0
  260-264     262      + 1          91        + 91
  265-269     267      + 2          60        +120
  270-274     272      + 3          44        +132
  275-279     277      + 4          22        + 88
  280-284     282      + 5          16        + 80
  285-289     287      + 6           6        + 36
  290-294     292      + 7           0           0
  295-299     297      + 8           0           0
  300-304     302      + 9           1        +  9

                        Sum        576        +556      -542

The computation is now very simple. The products xF(x) are
easily found. Their sum, divided by 576 (= N), gives the distance
from the provisional to the arithmetic mean. We have then

$$ b = (556 - 542) : 576 = +0.0243\,w = +0.122 $$

and

$$ M = 257 + b = 257.122, $$

where the third decimal may be omitted.⁴


4. Compare the chapter on mean error.
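
The class computation of the mean can be retraced with a brief Python sketch. The class-marks and frequencies are those of Table 4; the variable names are illustrative assumptions.

```python
w = 5        # class interval
m0 = 257     # provisional mean: centre of the class 255-259

# Class-marks x = -11, ..., +9 and frequencies F(x) from Table 4.
marks = list(range(-11, 10))
freq = [1, 0, 0, 1, 2, 5, 13, 18, 47, 60, 81,
        108, 91, 60, 44, 22, 16, 6, 0, 0, 1]

N = sum(freq)
sum_xf = sum(x * f for x, f in zip(marks, freq))

b_units = sum_xf / N      # distance b in class-interval units
M = m0 + w * b_units      # arithmetic mean in the original units

print(N, sum_xf)          # 576 14
print(round(b_units, 4))  # 0.0243
print(round(M, 3))        # 257.122
```

Only 21 products are formed instead of 576 additions, which is the whole economy of the division into classes.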
Cap. III. The Dispersion.

The dispersion is designated σ and is defined by the formula

$$ \sigma^2 = \frac{(m_1 - M)^2 + (m_2 - M)^2 + \dots + (m_N - M)^2}{N}. \tag{1} $$

It can be computed directly by this formula or better by using
the provisional mean, in which case the formula reads:

$$ \sigma^2 = \frac{(m_1 - M_0)^2 + (m_2 - M_0)^2 + \dots + (m_N - M_0)^2}{N} - b^2, \tag{2} $$

where, as usual, $b = M - M_0$.

Taking the example in Table 3, we have

$$ \sigma^2 = 6078 : 24 - (8.2)^2 = 186.0 $$

and therefore

$$ \sigma = 13.64. $$
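
Formulae (1) and (2) can be checked against each other on the data of Table 3 with the following Python sketch (variable names are illustrative assumptions). Note that with b carried unrounded (b = 196/24) the result is about 186.56 and σ about 13.66; the values 186.0 and 13.64 in the text arise from using b rounded to 8.2.

```python
import math

# Elements m_k of Table 3.
m = [244, 243, 231, 275, 264, 256, 257, 250, 240, 266, 271, 259,
     256, 246, 263, 246, 267, 280, 259, 244, 252, 282, 261, 284]
N = len(m)
m0 = 250                                  # provisional mean

b = sum(v - m0 for v in m) / N            # b = M - M_0
sum_sq = sum((v - m0) ** 2 for v in m)    # last column of Table 3

var_prov = sum_sq / N - b ** 2            # formula (2)

M = sum(m) / N
var_direct = sum((v - M) ** 2 for v in m) / N   # formula (1)

sigma = math.sqrt(var_prov)

print(sum_sq)                                   # 6078
print(round(var_prov, 2), round(sigma, 2))      # 186.56 13.66
print(round(sum_sq / N - 8.2 ** 2, 1))          # 186.0 (b rounded as in the text)
```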

If the elements are arranged in classes, the computation--with
check--runs as follows. I take the same example as in Table 4.


Table 5.

Number of boys per 500 births.

s = 500, N = 576, M_0 = 257, w = 5.

  (x+1)^2      x      F(x)     x F(x)    x^2 F(x)    (x+1)^2 F(x)

    100      -11        1      -  11      +  121         100
     81      -10        0          0           0           0
     64      - 9        0          0           0           0
     49      - 8        1      -   8      +   64          49
     36      - 7        2      -  14      +   98          72
     25      - 6        5      -  30      +  180         125
     16      - 5       13      -  65      +  325         208
      9      - 4       18      -  72      +  288         162
      4      - 3       47      - 141      +  423         188
      1      - 2       60      - 120      +  240          60
      0      - 1       81      -  81      +   81           0
      1        0      108          0           0         108
      4      + 1       91      +  91      +   91         364
      9      + 2       60      + 120      +  240         540
     16      + 3       44      + 132      +  396         704
     25      + 4       22      +  88      +  352         550
     36      + 5       16      +  80      +  400         576
     49      + 6        6      +  36      +  216         294
     64      + 7        0          0           0           0
     81      + 8        0          0           0           0
    100      + 9        1      +   9      +   81         100

              Sum     576      +  14      + 3596      + 4200


The expression (2) for the dispersion takes now the following
form:

$$ \sigma^2 = w^2 \left\{ \frac{\sum x^2 F(x)}{N} - b^2 \right\}, $$

where b is expressed in class interval units and $\sum x^2 F(x)$
signifies the sum of all numbers in the fifth column of Table 5.

From the table we have

$$ \sum x^2 F(x) = 3596, $$

and in Art. 6 we found b = +0.024 w. Therefore

$$ \sigma^2 = w^2 \{ 3596 : 576 - (0.024)^2 \} = w^2 \cdot 6.242 $$

and

$$ \sigma = w \cdot 2.498 = 12.49. $$

The entire computation is carried out in these lines. A direct
calculation by formula (1), without using the provisional mean,
would be the work of a whole day.
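
The class form of the dispersion can be reproduced from the columns of Table 5 as follows. This is a minimal Python sketch; the variable names are illustrative assumptions.

```python
import math

w = 5                                    # class interval
marks = list(range(-11, 10))             # class-marks x of Table 5
freq = [1, 0, 0, 1, 2, 5, 13, 18, 47, 60, 81,
        108, 91, 60, 44, 22, 16, 6, 0, 0, 1]

N = sum(freq)
b = sum(x * f for x, f in zip(marks, freq)) / N        # in class units
sum_x2f = sum(x * x * f for x, f in zip(marks, freq))  # fifth column

var_units = sum_x2f / N - b ** 2         # sigma^2 in units of w^2
sigma_units = math.sqrt(var_units)
sigma = w * sigma_units                  # sigma in the original units

print(sum_x2f)                 # 3596
print(round(var_units, 3))     # 6.242
print(round(sigma_units, 3))   # 2.498
print(round(sigma, 2))         # 12.49
```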

The computation is easily checked with the help of the numbers
in the first and last columns of Table 5.

For, as is easy to see,

$$ \sum (x+1)^2 F(x) = \sum x^2 F(x) + 2 \sum x F(x) + \sum F(x). $$

The numbers in the last line of Table 5 give

    Σ x^2 F(x)   = + 3596
    2 Σ x F(x)   = +   28
    Σ F(x)       = +  576

                   + 4200 = Σ (x+1)^2 F(x).
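
The check rests on expanding (x+1)² term by term, and it can be verified mechanically. The Python sketch below computes both sides of the identity independently from the class-marks and frequencies of Table 5 (variable names are illustrative assumptions).

```python
marks = list(range(-11, 10))             # class-marks x of Table 5
freq = [1, 0, 0, 1, 2, 5, 13, 18, 47, 60, 81,
        108, 91, 60, 44, 22, 16, 6, 0, 0, 1]

sum_f = sum(freq)
sum_xf = sum(x * f for x, f in zip(marks, freq))
sum_x2f = sum(x * x * f for x, f in zip(marks, freq))

# Left side: computed from the first and last columns of the table.
lhs = sum((x + 1) ** 2 * f for x, f in zip(marks, freq))
# Right side: assembled from the three column sums.
rhs = sum_x2f + 2 * sum_xf + sum_f

print(lhs, rhs)   # 4200 4200
```

An error in any single product of the fourth or fifth column would make the two sides disagree, which is what gives the check its value.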


Presupposing that the higher characteristics--and in
particular the skewness and the excess--are small, the dispersion enables
us to compute the distribution of the elements of the statistical
series into the several classes very easily by means of the
Gaussian law of error. I shall return to this point in the chapter
on frequency curves and will only mention here one simple property
of the dispersion: the number of elements of the series between
the limits M - σ and M + σ is about 2/3 of the whole number N. In the
example at hand the dispersion is about 2½ class intervals. The
combined number of elements in the classes with the marks -2,
-1, 0, +1, and +2 is 400 in the table, and 2/3 of 576 is 384,
so that in this example somewhat more than 2/3 of the total number
of elements lie between the limits M + σ and M - σ.
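
The count for Table 4 can be reproduced in a few lines of Python. As in the text, the interval M ± σ is approximated by the five whole classes with marks -2 to +2, since σ is about 2½ class intervals; the variable names are illustrative assumptions.

```python
marks = list(range(-11, 10))             # class-marks x of Table 4
freq = [1, 0, 0, 1, 2, 5, 13, 18, 47, 60, 81,
        108, 91, 60, 44, 22, 16, 6, 0, 0, 1]

N = sum(freq)
# Elements in the classes with marks -2, -1, 0, +1, +2.
inside = sum(f for x, f in zip(marks, freq) if abs(x) <= 2)

print(inside)                  # 400
print(2 * N // 3)              # 384
print(round(inside / N, 3))    # 0.694
```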





Taking the example in Table 3, where M = 258.2, σ = 13.6,
we have M + σ = 271.8 and M - σ = 244.6. The number of elements in
Table 3 between these limits is 16 and exactly equals 2/3 of the
total number of elements (24).

For this reason the dispersion is an excellent measure of the
variance¹ about the mean.

On this property is founded the use of the dispersion to judge
the uncertainty in determination of quantities from statistical
observations. I shall treat this question in the next chapter.

The dispersion can also be used to determine the limits
between which the elements of a statistical series lie on the
average. The theoretical results in this connexion are collected
in the following table.


Table 6.

The limits, on either side of the mean, beyond which an average of
one element (of a statistical series with N elements) will fall.

      N      Limits              N       Limits

     10     ± 1.65 σ           300      ± 2.94 σ
     20     ± 1.96 σ           400      ± 3.0 σ
     30     ± 2.13 σ           500      ± 3.0 σ
     40     ± 2.24 σ           600      ± 3.1 σ
     50                        700      ± 3.2 σ
     60     ± 2.39 σ           800      ± 3.2 σ
     70     ± 2.45 σ           900      ± 3.2 σ
     80     ± 2.50 σ          1000      ± 3.3 σ
     90     ± 2.54 σ         10000      ± 3.9 σ
    100     ± 2.58 σ        100000      ± 4.4 σ
    200     ± 2.81 σ


Taking the example in Table 3 ($N = 24$, $\sigma = 13.6$, $M = 258.2$),
we have, from Table 6, the limits $\pm 2.03\,\sigma = \pm 27.6$. We may then
expect that a single element of the series is smaller than 258.2
- 27.6 = 230.6 or larger than 258.2 + 27.6 = 285.8. In fact there
is no element without these limits, but two which lie near the
limits. The lowest element is 231 (No. 3), the highest 284 (No.
24); in good correspondence with theory.
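
The entries of Table 6 can be approximated by a simple rule. The sketch below rests on an assumption that the text does not state explicitly: the elements follow the Gaussian law of error, and the limit is the multiple $x$ of $\sigma$ for which the expected number of elements beyond $\pm x\sigma$ is exactly one, i.e. $N \cdot P(|z| > x) = 1$. The function name `one_in_n_limit` is hypothetical.

```python
from statistics import NormalDist


def one_in_n_limit(n: int) -> float:
    """Multiple of sigma beyond which, on average, one of n elements falls.

    Assumption (not from the text): Gaussian elements and two-sided tails,
    so that n * P(|z| > x) = 1, i.e. P(z < x) = 1 - 1 / (2 n).
    """
    return NormalDist().inv_cdf(1.0 - 1.0 / (2.0 * n))


if __name__ == "__main__":
    mean, sigma = 258.2, 13.6          # Table 3 values quoted in the text
    x = one_in_n_limit(24)
    print(f"limit for N = 24: +/- {x:.2f} sigma")
    print(f"interval: {mean - x * sigma:.1f} to {mean + x * sigma:.1f}")
```

Under this assumption the rule gives a value close to the $\pm 2.03\,\sigma$ used above for $N = 24$, and the limits widen only slowly as $N$ grows.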

In Table 4 ($N = 576$, $\sigma = 12.5$, $M = 257.1$) we have the limits
$M - 3.1\,\sigma = 218.4$ and $M + 3.1\,\sigma = 295.8$. Without these limits we
have among the 576 elements of Table 4 three elements. This is
properly two too many. But firstly Table 6 is valid only on the
average and secondly we do not yet know whether the higher
characteristics may be neglected. However the smallest element (202 boys
in 500 births, Jamtland, February 1890), which differs from the
mean by more than four times the dispersion, appears to be
erroneous. Therefore I applied to Dr Widell, director of the central
bureau of statistics, who very kindly had the original data for
this element reread. A few small errors were indeed found, which,
however, do not materially affect the result.²

1. The dispersion has many different names in the literature. The
expression of Gauss is mittlere Abweichung, Pearson applies the name
standard deviation, some German mathematicians say Streuung. The name
dispersion has the advantage of being understandable in any language. <In
English the term mean deviation is ill advised, as suggesting rather the
average deviation ($\theta$). The name Streuung, and its equivalent, variance,
have been used for $\sigma$. --Tr.>

The average deviation. A convenient measure of variance is
furnished by the average deviation. It is designated $\theta$ and defined
by the formula

$\theta = \dfrac{|m_1 - M| + |m_2 - M| + \dots + |m_N - M|}{N},$

where $|m_i - M|$ designates the difference between $m_i$ and $M$,
taken positive (even if $m_i$ is less than $M$). If we take, ex. gr.,
the series of Table 3, the computation of $\theta$ runs as in Table 7
below. Take the differences $m - M$ (from Art. 5, $M = 258.2$) and add
them, ignoring sign. The sum of the numbers in the second column is
266 and therefore

$\theta = 266 : 24 = 11.08.$

Table 7.

s = 500, N = 24.

    m     |m - 258.2|

   244       14.2
   243       15.2
   231       27.2
   275       16.8
   264        5.8
   256        2.2
   257        1.2
   250        8.2
   240       18.2
   266        7.8
   271       12.8
   259        0.8
   256        2.2
   246       12.2
   263        4.8
   246       12.2
   267        8.8
   280       21.8
   259        0.8
   244       14.2
   252        6.2
   282       23.8
   261        2.8
   284       25.8

From the average deviation, the dispersion can be approximately
computed by the formula

(4)    $\sigma = \sqrt{\dfrac{\pi}{2}}\;\theta = 1.2533\,\theta.$

In the present example, computing $\sigma$ from formula (4), we have
$\sigma = 13.9$, whereas by direct computation by formula (2) we found
$\sigma = 13.6$.
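
The computation of the average deviation can be retraced in a few lines. This is only a sketch: the 24 elements are read from Table 7, and the fourth element, which is hard to read in the table, is taken here as 275 (deviation 16.8) because that value makes the second column add up to the stated total of 266. The variable names are illustrative.

```python
from math import pi, sqrt

# Elements m of Table 3 as read from Table 7 (boys in 500 births, N = 24).
# The fourth value (275) is an assumption recovered from the column total.
m = [244, 243, 231, 275, 264, 256, 257, 250, 240, 266, 271, 259,
     256, 246, 263, 246, 267, 280, 259, 244, 252, 282, 261, 284]

N = len(m)
M = sum(m) / N                                    # arithmetic mean
theta = sum(abs(x - M) for x in m) / N            # average deviation
sigma_direct = sqrt(sum((x - M) ** 2 for x in m) / N)   # dispersion, direct
sigma_from_theta = sqrt(pi / 2) * theta           # approximation 1.2533 * theta

print(f"M = {M:.1f}, theta = {theta:.2f}")
print(f"sigma direct = {sigma_direct:.1f}, from theta = {sigma_from_theta:.1f}")
```

The approximation through $\theta$ presupposes a roughly Gaussian series; with only 24 elements the two values of the dispersion need not agree more closely than they do here.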


2. It must be remembered in this connexion that
the series of Table 4 was not directly observed, but
that the numbers were reduced to the basis of 500
births per province. Compare in this respect the
remarks of Chapter VII.


Cap. IV. On mean Error.

12. The mean error of a quantity $x$ is denoted by $\varepsilon(x)$ (or briefly
$\varepsilon$). By this we mean that, on the average, in a [...] of $x$, the
value of $x$ is found two times out of three to lie within, and once
out of three to lie without, the limits $M - \varepsilon$ and $M + \varepsilon$.
Here $M$ signifies [...].

13. The mean error of an element of a statistical series is equal
to the dispersion of the series; or, expressed in a formula,

(1)    $\varepsilon(m) = \sigma.$

The mean error of an element of the series in Table 5 is
equal, by Art. 8, to 12.5. This can be expressed (not strictly
correctly) in words thus: choosing at random (from those of Table
5) 500 newborn children, in two cases out of three the number of
boys in these 500 will lie between the limits 257 - 12.5 = 244.5
and 257 + 12.5 = 269.5.

The mean error of the mean is given by the formula

(2)    $\varepsilon(M) = \dfrac{\sigma}{\sqrt{N}},$

from which it is seen that the mean error of the mean varies
inversely as the square root of the number of elements.

By formula (2) the mean error of the mean of the series
in Table 3 is $13.64/\sqrt{24} = 2.78$; that of that in Table 5 is
$12.49/\sqrt{576} = 0.520$.

It is customary¹ in giving a statistical value to add its mean
error with the sign ±. Thus the mean number of boys in 500 births
from Table 3 is written

258.2 ± 2.78

and from Table 5

257.12 ± 0.52.

15. The mean error of the dispersion ($\sigma$) is given by the formula

(3)    $\varepsilon(\sigma) = \dfrac{\sigma}{\sqrt{2N}}.$

Applying this formula to the examples of Tables 3 and 5, the
dispersion with mean error added from Table 3 is written

$\sigma = 13.64 \pm 1.97$

and from Table 5

$\sigma = 12.49 \pm 0.367.$

1. <Among English writers, the sign ± is frequently used with the
probable error, a quantity defined by P.E. = 0.67449 $\varepsilon$. --Tr.>


Notice that the determination of $\sigma$ (and also of $M$) from Table
5 is significantly more certain than from Table 3 . This is caused 
by the different number of elements. The difference between the 
two values is smaller than the mean error. Compare Art. 17 . 

The mean error of the average deviation ($\theta$) is given by

(4)    $\varepsilon(\theta) = \sqrt{\pi - 2}\;\dfrac{\theta}{\sqrt{2N}} = 1.0685\,\dfrac{\theta}{\sqrt{2N}}.$

In the example of Art. 11 we have $\theta = 11.08$ and $N = 24$, and
therefore, with the mean error appended,

$\theta = 11.08 \pm 1.71.$
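
The three mean-error formulas of this chapter can be collected in a short sketch. The function names are hypothetical; the formula for the average deviation is used in the form $\sqrt{\pi - 2}\,\theta/\sqrt{2N}$, which is the reading that reproduces the value 1.71 quoted in the text.

```python
from math import pi, sqrt


def mean_error_of_mean(sigma: float, n: int) -> float:
    """Formula (2): mean error of the mean, sigma / sqrt(N)."""
    return sigma / sqrt(n)


def mean_error_of_dispersion(sigma: float, n: int) -> float:
    """Formula (3): mean error of the dispersion, sigma / sqrt(2 N)."""
    return sigma / sqrt(2 * n)


def mean_error_of_average_deviation(theta: float, n: int) -> float:
    """Formula (4): sqrt(pi - 2) * theta / sqrt(2 N)."""
    return sqrt(pi - 2) * theta / sqrt(2 * n)


if __name__ == "__main__":
    # Table 3: sigma = 13.64, theta = 11.08, N = 24 (values quoted in the text)
    print(f"e(M)     = {mean_error_of_mean(13.64, 24):.2f}")
    print(f"e(sigma) = {mean_error_of_dispersion(13.64, 24):.2f}")
    print(f"e(theta) = {mean_error_of_average_deviation(11.08, 24):.2f}")
```

All three errors fall off as $1/\sqrt{N}$, which is why the determinations from the 576 elements of the larger series are so much more certain than those from the 24 elements of Table 3.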

The mean error of the sum of two observed quantities $a$ and $b$
is given by the formula

(5)    $\varepsilon(a + b) = \sqrt{\varepsilon^2(a) + \varepsilon^2(b)}.$

The mean error of the difference between $a$ and $b$ is obtained
from the formula

(6)    $\varepsilon(a - b) = \sqrt{\varepsilon^2(a) + \varepsilon^2(b)}.$

The mean error of the sum is thus equal to the mean error of
the difference.

The mean error of a multiple $ka$ of $a$ is

(7)    $\varepsilon(ka) = k\,\varepsilon(a).$

From (6) and (7) follows the more general formula

(7*)   $\varepsilon(k_1 a + k_2 b) = \sqrt{k_1^2\,\varepsilon^2(a) + k_2^2\,\varepsilon^2(b)}.$

As example for (7) let us compute the mean error of the 
probability of birth of a boy. This is obtained from Table 3 or 5 
by dividing by the number of comparison (s), or by multiplying by 
1/s. Taking the larger series (Table 5), we find by Art. 14 the 
value 0.51424 ± 0.00104 for the probability of birth of a boy. 
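
Formulas (5) to (7*) and the example just given can be sketched as follows. The function names are hypothetical, and taking the absolute value of the multiplier in (7) is an added assumption so that a negative multiplier still yields a positive mean error.

```python
from math import hypot


def mean_error_sum(e_a: float, e_b: float) -> float:
    """Formulas (5) and (6): the same value serves for a + b and a - b."""
    return hypot(e_a, e_b)


def mean_error_multiple(k: float, e_a: float) -> float:
    """Formula (7); abs(k) is an assumption covering negative multipliers."""
    return abs(k) * e_a


def mean_error_linear(k1: float, e_a: float, k2: float, e_b: float) -> float:
    """Formula (7*): mean error of k1 * a + k2 * b."""
    return hypot(k1 * e_a, k2 * e_b)


# Probability of the birth of a boy from the larger series: M = 257.12 +/- 0.520
s = 500
M, e_M = 257.12, 0.520
p = M / s
e_p = mean_error_multiple(1 / s, e_M)
print(f"p = {p:.5f} +/- {e_p:.5f}")
```

Dividing by the number of comparison rescales the value and its mean error alike, so the relative uncertainty of the probability is the same as that of the mean.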

The number of decimals in the numerical value of a quantity is
best determined from its mean error. It is convenient to proceed
according to the following rules:

1. Express the mean error with three (possibly two) significant
figures;

2. give the quantity itself with the same accuracy (the same
number of decimals) as its mean error or possibly with one
decimal fewer.

All numerical values in this chapter are given in accordance 
with these rules. 

If on two different occasions the values $a$ and $b$ have been
found for a characteristic of a statistical series, the
correspondence between the two values is called good if the difference
between $a$ and $b$ is numerically smaller than the mean error of this
difference, given by formula (6). The correspondence is
satisfactory, if the difference $a - b$ remains less than twice (exceptionally
three times) its mean error. If the difference $a - b$ were to
increase over three times (possibly twice) its mean error, the
correspondence must be considered less good. In general there is
then occasion to hope that a plausible explanation for this
deviation can be discovered. Only once in 27 000 cases does a
value of the difference $a - b$ accidentally occur that is greater
than four times its mean error.
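
The rule for judging the correspondence of two values can be sketched as a small function. The name `correspondence` and the parameter `satisfactory_limit` are hypothetical; the text allows the boundary between "satisfactory" and "less good" to lie at twice or, exceptionally, three times the mean error, so the limit is left adjustable.

```python
from math import hypot


def correspondence(a, e_a, b, e_b, satisfactory_limit=2.0):
    """Classify the agreement of a and b by |a - b| over its mean error.

    The mean error of the difference follows formula (6).
    satisfactory_limit is 2.0 by default; the text admits 3.0 exceptionally.
    """
    e_diff = hypot(e_a, e_b)
    ratio = abs(a - b) / e_diff
    if ratio < 1.0:
        verdict = "good"
    elif ratio < satisfactory_limit:
        verdict = "satisfactory"
    else:
        verdict = "less good"
    return verdict, ratio


# The two determinations of the dispersion quoted earlier in this chapter
verdict, ratio = correspondence(13.64, 1.97, 12.49, 0.367)
print(verdict, round(ratio, 2))
```

For these two determinations the difference is smaller than its mean error, which agrees with the remark made above when they were compared.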


Cap. V. Bernoulli’s Theorem. 


The simplest statistical Series. 

We assume that $s$ cards are successively drawn from a pack of $m$
black and $n$ red cards, the card drawn being replaced in the pack
after each drawing. In these $s$ drawings are drawn say $m_1$ black
cards in all. The experiment is repeated $N$ times; let

(1)    $m_1, m_2, m_3, \dots, m_N$

be the number of black cards obtained in these trials (each
trial encompassing $s$ simple drawings). Then the numbers (1) form
what I call a Bernoulli series or the simplest statistical series.

The characteristics of this series can be calculated from
Bernoulli's theorem [s]. Let $p$ designate the ratio of the number
of black cards to the total number of cards and $q$ the ratio of the
number of red cards to the total number of cards, so that

(2)    $p = \dfrac{m}{m + n}, \qquad q = \dfrac{n}{m + n},$

and therefore $p + q = 1$. The numbers $p$ and $q$ are called
respectively the probabilities for the drawing of a black and of a red card.

Bernoulli's theorem states that the mean of (1), which we
shall call the Bernoulli mean and designate $M_B$, is obtained from
the formula

(3)    $M_B = sp$

and that the dispersion of (1), which we shall call the Bernoulli
dispersion and designate $\sigma_B$, is given by the formula

(4)    $\sigma_B = \sqrt{spq}.$


The correctness of formulae (3) and (4) can easily be tested 
by experiment, although it is rather laborious to collect material 
in sufficient quantity. I have succeeded in obtaining, by the kind 
collaboration of some friends, a collection of 12 600 of these 
drawings from an ordinary pack of 52 cards, 10 000 of which are 
designed to illustrate Bernoulli's theorem. The others are applied 
in the following chapters. 
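
Where cards are not at hand, the experiment can be imitated with a pseudo-random generator. The sketch below is an illustration only, not the author's procedure: the seed, the number of trials and the helper name `bernoulli_series` are arbitrary choices.

```python
import random
from math import sqrt


def bernoulli_series(s, n_trials, p, rng):
    """Return n_trials counts of 'black' in s drawings with replacement."""
    return [sum(rng.random() < p for _ in range(s)) for _ in range(n_trials)]


rng = random.Random(1)            # arbitrary fixed seed, for repeatability
s, N, p = 50, 2000, 0.5           # illustrative sizes, not those of the text
series = bernoulli_series(s, N, p, rng)

M = sum(series) / N
sigma = sqrt(sum((m - M) ** 2 for m in series) / N)

M_B = s * p                       # formula (3)
sigma_B = sqrt(s * p * (1 - p))   # formula (4)
print(f"observed  M = {M:.3f}, sigma = {sigma:.3f}")
print(f"Bernoulli M = {M_B:.3f}, sigma = {sigma_B:.3f}")
```

The observed values should differ from the Bernoulli values by amounts comparable to their mean errors, $\sigma/\sqrt{N}$ and $\sigma/\sqrt{2N}$ respectively.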

These drawings may be arranged in groups with an arbitrary
number of comparison ($s$), and I shall give here the results for
three values of $s$: $s = 500$, $s = 50$, and $s = 10$, and compare the
mean and dispersion of these series with their values (3) and (4)
according to Bernoulli.

23. I begin with the case $s = 500$. 20 groups can be formed in all.
Since the computation of the characteristics has here a very simple
form, I give it at length. The provisional mean is conveniently
taken at 250. Table 8 now gives

$b = (85 - 152) : 20 = -3.35$

and therefore

$M = 246.65 \pm 3.09.$

The mean error is computed by formula (2) of the preceding chapter.

Moreover we have

$\sigma^2 = 4045 : 20 - b^2 = 191.03,$

so that, with mean error added,

$\sigma = 13.82 \pm 2.18.$


Table 8.

Number (m) of black cards in 500 drawings.

s = 500, N = 20, M_0 = 250.

    m      m - M_0     (m - M_0)^2

   252      +  2             4
   235      - 15           225
   248      -  2             4
   271      + 21           441
   260      + 10           100
   246      -  4            16
   228      - 22           484
   229      - 21           441
   234      - 16           256
   250         0             0
   271      + 21           441
   234      - 16           256
   258      +  8            64
   233      - 17           289
   273      + 23           529
   244      -  6            36
   249      -  1             1
   241      -  9            81
   231      - 19           361
   246      -  4            16

   Sum    +85  -152      +4045


We will now compare these values with those given by the
formulae of Bernoulli. Because in all these trials $p = q = \tfrac{1}{2}$, we
have from (3) and (4)

$M_B = 500 \times \tfrac{1}{2} = 250,$

$\sigma_B = \sqrt{500 \times \tfrac{1}{2} \times \tfrac{1}{2}} = 11.18.$

It is seen that $M$ is somewhat too small and $\sigma$ somewhat too
large. The correspondence, however, in the terminology of Art.
19, is satisfactory, since the difference is in no case more than
1.5 times the mean error.
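
The computation with a provisional mean can be retraced as follows. This is a sketch with illustrative variable names; the 20 elements are those of Table 8, where the two entries that are hard to read are taken as 271 and 273, in agreement with their listed deviations +21 and +23.

```python
from math import sqrt

# Number of black cards in each group of 500 drawings (Table 8)
m = [252, 235, 248, 271, 260, 246, 228, 229, 234, 250,
     271, 234, 258, 233, 273, 244, 249, 241, 231, 246]

s, M0 = 500, 250                  # number of comparison, provisional mean
N = len(m)
d = [x - M0 for x in m]           # deviations from the provisional mean

b = sum(d) / N                    # correction to the provisional mean
M = M0 + b
var = sum(x * x for x in d) / N - b * b
sigma = sqrt(var)

e_M = sigma / sqrt(N)             # mean error of the mean
e_sigma = sigma / sqrt(2 * N)     # mean error of the dispersion

p = q = 0.5
M_B = s * p                       # Bernoulli mean, formula (3)
sigma_B = sqrt(s * p * q)         # Bernoulli dispersion, formula (4)

print(f"M = {M:.2f} +/- {e_M:.2f}, sigma = {sigma:.2f} +/- {e_sigma:.2f}")
print(f"M_B = {M_B:.0f}, sigma_B = {sigma_B:.2f}")
```

Working from a provisional mean keeps the deviations small whole numbers, which is what makes the hand computation in the table convenient.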
We next collect the trials in groups of 50 drawings, i.e. $s = 50$.
Obviously 10 000 drawings divide into 200 of these groups. It
is now advantageous to arrange the elements in classes. The result
of the experiment is to be seen in Table 9: once were 14 black
cards obtained in 50 drawings, no times 15 black cards, twice 16
cards, etc.

Table 9.

Number (m) of black cards in 50 drawings.

s = 50, N = 200, M_0 = 25, w = 1.

    m      x     F(x)

   14    - 11      1
   15    - 10      0
   16    -  9      2
   17    -  8      2
   18    -  7      4
   19    -  6      8
   20    -  5      6
   21    -  4     15
   22    -  3     13
   23    -  2     15
   24    -  1     34
   25       0     14
   26    +  1     21
   27    +  2     26
   28    +  3     14
   29    +  4     10
   30    +  5      5
   31    +  6      5
   32    +  7      3
   33    +  8      2

                 200

The computation of the characteristics proceeded as per the
instructions of Art. 8 and the result, with mean error added, was

$M = 24.665 \pm 0.248,$
$\sigma = 3.510 \pm 0.176,$

whereas formulae (3) and (4) give:

$M_B = 25,$
$\sigma_B = 3.536.$

The obtained dispersion lies within the limits of its mean error,
while the mean is here again somewhat too small.¹ The difference,
however, is again less than $1.5\,\varepsilon$.

1. The mean computed will deviate in the same direction from $M_0$,
no matter how the experiments are grouped.



If the trials are arranged in groups of 10 drawings, the
following result is obtained. Because no fewer than 1000 elements 
are obtained here, division into classes is necessary. I give here 
the complete computation, which is not laborious. 


Table 10.

Number (m) of black cards in 10 drawings.

s = 10, N = 1000, M_0 = 5, w = 1.

 (x+1)^2    m     x     F(x)    x F(x)    x^2 F(x)    (x+1)^2 F(x)

    16      0    - 5       3     -  15      +  75        +   48
     9      1    - 4      10     -  40      + 160        +   90
     4      2    - 3      43     - 129      + 387        +  172
     1      3    - 2     116     - 232      + 464        +  116
     0      4    - 1     221     - 221      + 221             0
     1      5      0     247         0           0        +  247
     4      6    + 1     202     + 202      + 202        +  808
     9      7    + 2     115     + 230      + 460        + 1035
    16      8    + 3      34     + 102      + 306        +  544
    25      9    + 4       9     +  36      + 144        +  225
    36     10    + 5       0         0           0             0

                        1000     -  67      +2419        + 3285

Check:

   Σ x^2 F(x)  =  + 2419
   2 Σ x F(x)  =  -  134
   Σ F(x)      =  + 1000

                  + 3285


Hence we have

$b = -67 : 1000 = -0.067,$

and so, with mean error, according to Art. 14,

$M = 5 - 0.067 = 4.933 \pm 0.050$

and, since

$\sigma^2 = 2419 : 1000 - b^2 = 2.415,$

$\sigma = 1.554 \pm 0.035.$

For the Bernoulli mean and the Bernoulli dispersion we have
from (3) and (4) the values

$M_B = 5,$

$\sigma_B = 1.581.$

The agreement is as good as that of the previous example.
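
The class computation of Table 10, including its check column, may be sketched as follows. The dictionary `freq` holds the class frequencies $F(x)$ read from the table; the variable names are illustrative.

```python
from math import sqrt

M0, w = 5, 1                      # provisional mean and class interval
# m -> F: number of trials (out of 1000) with m black cards in 10 drawings
freq = {0: 3, 1: 10, 2: 43, 3: 116, 4: 221, 5: 247,
        6: 202, 7: 115, 8: 34, 9: 9, 10: 0}

N = sum(freq.values())
S1 = sum((m - M0) * f for m, f in freq.items())              # sum of x F(x)
S2 = sum((m - M0) ** 2 * f for m, f in freq.items())         # sum of x^2 F(x)
S_check = sum((m - M0 + 1) ** 2 * f for m, f in freq.items())

# Check column: sum (x+1)^2 F = sum x^2 F + 2 sum x F + sum F
assert S_check == S2 + 2 * S1 + N

b = S1 / N
M = M0 + w * b
sigma = w * sqrt(S2 / N - b * b)
sigma_B = sqrt(10 * 0.5 * 0.5)    # formula (4) with s = 10, p = q = 1/2

print(f"M = {M:.3f}, sigma = {sigma:.3f}, sigma_B = {sigma_B:.3f}")
```

The check works because $(x + 1)^2 = x^2 + 2x + 1$ holds term by term, so an arithmetical slip in either moment column shows up as a mismatch in the totals.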


Cap. VI. The Theorems of Poisson and Lexis. 

Poisson's theorem. We now assume that in our trials--each of $s$
drawings from a pack of cards--the ratio between the numbers of
black and of red cards varies from drawing to drawing. Let $p_1$ be
the probability of getting a black card in the first drawing, $p_2$
the corresponding probability in the second, $p_3$ in the third, etc.
Likewise let $q_1, q_2, q_3, \dots$ be the corresponding probabilities
of getting a red card.

The trial is repeated $N$ times, changing the composition of the
pack in the same way within each trial of $s$ drawings.

If $m_1, m_2, m_3, \dots, m_N$ are the number of black cards
obtained in each of these trials, then I say that the numbers (the
elements)

(1)    $m_1, m_2, m_3, \dots, m_N$

form a Poisson series.

The mean ($M_P$) and the dispersion ($\sigma_P$) of this series were first
computed by Poisson and have the following values:

(2)    $M_P = p_1 + p_2 + p_3 + \dots + p_s = \sum p_k,$

(3)    $\sigma_P^2 = p_1 q_1 + p_2 q_2 + p_3 q_3 + \dots + p_s q_s = \sum p_k q_k.$

We will designate the arithmetic mean of the $s$ values of $p$ by
$p_0$, that of the $q$ by $q_0$, so that

(4)    $p_0 = \dfrac{p_1 + p_2 + p_3 + \dots + p_s}{s}, \qquad q_0 = \dfrac{q_1 + q_2 + q_3 + \dots + q_s}{s}.$

If all drawings had been performed with the constant
probabilities $p_0$ and $q_0$, Bernoulli's theorem would yield

(5)    $M_B = s p_0,$

(6)    $\sigma_B = \sqrt{s p_0 q_0},$

and hence

(7)    $M_P = M_B.$

Hence follows: if the $s$ drawings had been performed all with
constant probability of black card $p_0$, instead of with the varying
probabilities $p_1, p_2, p_3, \dots, p_s$, the mean would have been the
same as in the Poisson series.

For the dispersion, however, is found, after a short
calculation, the formula

(8)    $\sigma_P^2 = \sigma_B^2 - \sum (p_k - p_0)^2,$

so that the dispersion of the Poisson series is always smaller
than the Bernoulli dispersion corresponding to the probability $p_0$.
It can however be proved that the difference between these
values of dispersion is generally insignificant.

I have chosen the following experiment to illustrate Poisson's
theorems (7) and (8). From a pack of cards with 13 cards of each
suit one card was drawn at random and the colour noted. Before the 
next drawing a spade was removed and replaced by a heart from an¬ 
other pack, so that the pack consisted of 12 spades, 13 clubs, 13 
diamonds and 14 hearts. From this pack a card was now drawn and 
the colour noted. Then again a spade was removed and replaced by a 
heart, whereupon a new drawing was made. This procedure was con¬ 
tinued until all the spades had been removed and replaced by 
hearts. Then this operation was continued with the clubs, which, 
after drawings, were replaced one by one by diamonds. In this man¬ 
ner 27 (= s) simple drawings were obtained. 

These 27 drawings together form one trial. In all 100 (= $N$)
such trials were made, consisting each of 27 (= s) drawings ac¬ 
cording to the plan above. The result and the computation of the 
characteristics proceeds from the following table. 

We obtain hence $b = +0.16$ and, with mean error,

$M = 7.16 \pm 0.194,$

$\sigma^2 = 3.78 - (0.16)^2 = 3.754,$

so that

$\sigma = 1.937 \pm 0.138.$

According to (2) and (3) the values from the Poisson theory are

$M_P = 6.75,$

$\sigma_P = 2.111.$

The agreement of the mean is not particularly good; that of
the dispersion is satisfactory.


Table 11.

Poisson series. Number (m) of black cards in 27 (s) drawings.

The probability varies within each trial (from one drawing to the next),
but not from one trial to the next.

s = 27, N = 100, M_0 = 7, w = 1.

 (x+1)^2    m     x     F(x)    x F(x)    x^2 F(x)    (x+1)^2 F(x)

     9      3    - 4       2     -   8      +  32           18
     4      4    - 3       6     -  18      +  54           24
     1      5    - 2      14     -  28      +  56           14
     0      6    - 1      14     -  14      +  14            0
     1      7      0      22         0           0           22
     4      8    + 1      17     +  17      +  17           68
     9      9    + 2      14     +  28      +  56          126
    16     10    + 3       8     +  24      +  72          128
    25     11    + 4       1     +   4      +  16           25
    36     12    + 5       1     +   5      +  25           36
    49     13    + 6       1     +   6      +  36           49

                         100     +  16      + 378          510

Check:

   Σ x^2 F(x)  =  + 378
   2 Σ x F(x)  =  +  32
   Σ F(x)      =  + 100

                    510


The arithmetic mean of the probabilities of a black card is ¼
(= $p_0$). If all drawings had been performed with this constant
probability, the result would have been, according to (5) and (6),

$M_B = 6.75 = 27 \times \tfrac{1}{4},$

$\sigma_B = 2.25,$

which numbers obey Poisson's theorems ($M_P = M_B$, $\sigma_P < \sigma_B$).
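
The theoretical values for this experiment follow directly from formulas (2) to (8). The sketch below assumes, from the description of the experiment, that the probability of a black card in the successive drawings runs through 26/52, 25/52, ..., 0/52; the variable names are illustrative.

```python
from math import sqrt

# Probability of a black card in each of the 27 drawings of one trial:
# the pack starts with 26 black cards and loses one before each later drawing.
p = [k / 52 for k in range(26, -1, -1)]
s = len(p)

M_P = sum(p)                                   # formula (2)
var_P = sum(pk * (1 - pk) for pk in p)         # formula (3)

p0 = M_P / s                                   # formula (4)
var_B = s * p0 * (1 - p0)                      # square of formula (6)
spread = sum((pk - p0) ** 2 for pk in p)       # sum of (p_k - p_0)^2 in (8)

print(f"M_P = {M_P:.2f}, sigma_P = {sqrt(var_P):.3f}")
print(f"p_0 = {p0:.2f}, sigma_B = {sqrt(var_B):.2f}")
print(f"sigma_B^2 - sum (p_k - p_0)^2 = {var_B - spread:.4f}")
```

The last line reproduces $\sigma_P^2$, which confirms formula (8) numerically for this particular sequence of probabilities.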

Lexis' theorem. Much more important than the theorem of
Poisson, however, is a theorem proved by Lexis.

We will perform $N$ trials with drawings from a pack of cards in
the following manner. In the first trial let the probability of
getting a black card be $p_1$. Perform $s$ drawings and get $m_1$ black
cards. In the second trial the composition of the pack is
different. The probability of getting a black card is now $p_2$. Make $s$
drawings again and get $m_2$ black cards. In this way $N$ trials are
made, where the composition of the pack changes from trial to
trial, but remains constant among the $s$ drawings which make up a
trial. In this manner is produced a statistical series

$m_1, m_2, m_3, \dots, m_N,$

which I call a Lexis series.

If $M_L$ and $\sigma_L$ are the mean and the dispersion of this series,
then

(9)    $M_L = s p_0,$

where

(10)   $p_0 = \dfrac{p_1 + p_2 + p_3 + \dots + p_N}{N},$

and

(11)   $\sigma_L^2 = s p_0 q_0 + (s^2 - s)\,\sigma_p^2,$

where

(12)   $\sigma_p^2 = \dfrac{(p_1 - p_0)^2 + (p_2 - p_0)^2 + (p_3 - p_0)^2 + \dots + (p_N - p_0)^2}{N}.$

Some important conclusions follow from these formulae. I
present here only the following of these:

1. The mean of a Lexis series is the same as that of a Bernoulli
series with the constant probability $p_0$: $M_L = M_B$.

2. The dispersion of a Lexis series is greater than the
dispersion of the said Bernoulli series: $\sigma_L > \sigma_B$, where
$\sigma_B = \sqrt{s p_0 q_0}$.

3. The ratio $\sigma_L/\sigma_B$ is little larger than 1 for small $s$, but
goes to infinity as the square root of $s$.

The third of these consequences (or more exactly its first
part) was first enunciated by Bortkiewicz and was named by him the
law of small numbers.

The ratio $\sigma_L/\sigma_B$--which I call the Lexis ratio and designate by
$L$--plays a significant role in practical statistics. I take the
opportunity of giving examples of this in the sequel. Lexis says
that a statistical series has supernormal dispersion when $L > 1$,
normal, when $L = 1$, and subnormal, when $L < 1$. The Lexis series
defined above always has, by (11), supernormal dispersion, whereas
a Poisson series has, by (8), subnormal dispersion. We shall
find that most series in applied statistics have supernormal
dispersion.¹

1. As an example of series with subnormal dispersion, the number of
twin births in Sweden with the total births as number of comparison may be
adduced.

Before treating, in the next chapter, some examples from
applied statistics, I shall illustrate the theorem of Lexis by an
experimental series as I have illustrated Poisson's theorem.

10 trials, each consisting of 10 simple drawings, were made
with an ordinary pack of cards, and the number of black cards got
in each trial noted. Then 10 new trials were made with a pack
consisting of 25 black and 27 red cards. Then 10 trials with a pack
of 24 black and 28 red cards, etc. Of the 270 trials which were
performed in this way (until the pack consisted only of red cards)
I take the 100 first, which give the following result.


Table 12.

Lexis series. Number of black cards in 10 drawings.

The probability is constant within each trial,
but varies from one trial to the next.

s = 10, N = 100, M_0 = 4.

 (x+1)^2    m     x     F(x)    x F(x)    x^2 F(x)    (x+1)^2 F(x)

     4      1    - 3       4     -  12      +  36        +  16
     1      2    - 2       9     -  18      +  36        +   9
     0      3    - 1      19     -  19      +  19            0
     1      4      0      21         0           0        +  21
     4      5    + 1      23     +  23      +  23        +  92
     9      6    + 2      10     +  20      +  40        +  90
    16      7    + 3      12     +  36      + 108        + 192
    25      8    + 4       2     +   8      +  32        +  50

                         100     +  38      + 294        + 470

Check:

   Σ x^2 F(x)  =  + 294
   2 Σ x F(x)  =  +  76
   Σ F(x)      =  + 100

                  + 470



Hence we have $b = +0.38$, and therefore

$M = 4.38 \pm 0.167,$

$\sigma^2 = 294 : 100 - b^2 = 2.796,$

so that

$\sigma = 1.672 \pm 0.118.$

The mean probability ($p_0$) for all trials was

$p_0 = 21.50 : 52 = 0.4135,$

from which we have

$M_B = s p_0 = 4.135,$

$\sigma_B = \sqrt{s p_0 q_0} = 1.557.$

Computing the values according to Lexis from (9) and (11)--
computing the dispersion is somewhat lengthy--we have

$M_L = 4.135,$

$\sigma_L = 1.643.$

The agreement between the observed values and those of Lexis
is satisfactory. Comparing the latter with the values according
to Bernoulli, we find theorems 1. and 2. above confirmed.

The Lexis ratio ($L$) has the value 1.06. The series has
therefore a very slightly supernormal dispersion.

Arranging all the data into 27 trials, of 100 drawings each, a
series is obtained for which $L = 3.82$.

The dispersion is here supernormal in significantly high
degree, in agreement with theorem 3. of Art. 29.

In closing it must be noted that by a combination of the
theorems of Poisson and Lexis it is possible theoretically to
reproduce any arbitrary homograde statistical series.
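
The "somewhat lengthy" computation of the Lexis values for this experiment can be sketched as follows. The function name `lexis_theory` is hypothetical, and the list of probabilities is an assumption read from the description of the experiment: ten trials each with 26, 25, ..., 17 black cards in a pack of 52.

```python
from math import sqrt


def lexis_theory(p, s):
    """Return (M_L, sigma_L, sigma_B, L) from formulas (9) to (12).

    p holds the probability of a black card in each trial,
    s is the number of drawings per trial.
    """
    n = len(p)
    p0 = sum(p) / n                                   # formula (10)
    q0 = 1 - p0
    var_p = sum((pk - p0) ** 2 for pk in p) / n       # formula (12)
    M_L = s * p0                                      # formula (9)
    sigma_B = sqrt(s * p0 * q0)
    sigma_L = sqrt(s * p0 * q0 + (s * s - s) * var_p) # formula (11)
    return M_L, sigma_L, sigma_B, sigma_L / sigma_B


# First 100 trials: 10 trials for each pack with 26, 25, ..., 17 black cards
p_first_100 = [black / 52 for black in range(26, 16, -1) for _ in range(10)]
M_L, sigma_L, sigma_B, L = lexis_theory(p_first_100, s=10)
print(f"M_L = {M_L:.3f}, sigma_L = {sigma_L:.3f}, "
      f"sigma_B = {sigma_B:.3f}, L = {L:.2f}")
```

The same function can be applied to other groupings of the drawings by changing `p` and `s`; the value it returns is the theoretical ratio, which need not coincide with a ratio computed from the observed dispersion.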

Cap. VII. The observed statistical Series. 

Take as given a series of numbers which express the number of
newborn children in a certain country in various years. Let these
numbers be $m_1, m_2, m_3, \dots, m_N$. Suppose moreover for simplicity's
sake that the number of inhabitants ($s$) has remained constant during
the period observed. Then the ratios $m_1/s, m_2/s, m_3/s, \dots, m_N/s$
may be considered as the observed probabilities of the birth
of a child in the years in question. This identification of a
statistical quantity with a mathematical probability is only an
analogy. Very possibly it has little connexion with the observed
statistical phaenomena; but closer investigation shews the great
significance in statistics of such a train of thought.

The next course is to consider the numbers of the statistical
series $m_1, m_2, m_3, \dots, m_N$ as analogous to the number of
'favourable' cases in $N$ trials, where each trial consists of $s$ drawings
(from a pack of cards); all these performed with the constant
probability $p_0$, approximately expressible by the arithmetic
mean of the empirical probabilities $m_1/s, m_2/s, \dots, m_N/s$.

The dispersion and the other characteristics of the observed
statistical series are then given by Bernoulli's theorem. The
founders of mathematical statistics regarded the identification of
an observed statistical series with a Bernoulli series almost
axiomatically evident. Laplace likewise was led by this
identification to incorrect conclusions, inter al., with respect to the
census of population in France which he instigated.

Only after the appearance of Lexis was the untenability of
this way of thinking clearly realised, causing a clearer insight
into the nature of statistical series.

32. I shall study in this chapter some examples of statistical
series, which have all been taken from the official statistics of
Sweden. All figures given here have been reduced, unless otherwise
stated, to a population of 5 million men as basis, so that the
number of comparison s = 5 000 000 . This reduction requires a 
small correction which will be treated in the next chapter. It is, 
however, insignificant in the examples at hand, and we shall 
assume the figures given as directly observed for a population of 
5 000 000 men. 

We treat first the question of the number of new-born children
in Sweden in a year. In the interest of clarity I give in this
case the computation of the mean and dispersion in full detail.

$b = (53\,190 - 50\,390) : 20 = 140.$

$M = M_0 + b = 140\,140.$

$\sigma^2 = 654\,401\,400 : 20 - b^2 = 32\,700\,470,$

so that

$\sigma = 5718.$

The empirical probability of birth ($p_0$) is

$p_0 = M : s = 0.02803,$

therefore we have

$q_0 = 1 - p_0 = 0.97197$

and the Bernoulli dispersion is

$\sigma_B = \sqrt{s p_0 q_0} = 369.0.$

The actually observed dispersion (5718) is therefore much
larger than the Bernoulli dispersion. The series of births is
significantly supernormal. The Lexis ratio ($L$) has the value

$L = 5718 : 369.0 = 15.50.$
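
The whole computation for the births can be retraced from the yearly figures of Table 13. This is a sketch with illustrative names; the figures are the rounded values of that table.

```python
from math import sqrt

# Births in Sweden per 5 000 000 inhabitants, rounded to tens (Table 13)
births = {
    1881: 145230, 1882: 146640, 1883: 144320, 1884: 149360, 1885: 146600,
    1886: 148270, 1887: 148020, 1888: 143680, 1889: 138300, 1890: 139600,
    1891: 141070, 1892: 134830, 1893: 136540, 1894: 134840, 1895: 136820,
    1896: 135330, 1897: 132750, 1898: 134820, 1899: 131320, 1900: 134460,
}

s, M0 = 5_000_000, 140_000        # number of comparison, provisional mean
N = len(births)
d = [m - M0 for m in births.values()]

b = sum(d) / N
M = M0 + b
sigma = sqrt(sum(x * x for x in d) / N - b * b)   # observed dispersion

p0 = M / s                                        # empirical probability
sigma_B = sqrt(s * p0 * (1 - p0))                 # Bernoulli dispersion
L = sigma / sigma_B                               # Lexis ratio

print(f"M = {M:.0f}, sigma = {sigma:.0f}, sigma_B = {sigma_B:.1f}, L = {L:.2f}")
```

The small difference from the printed $\sigma_B$ in the last decimal comes only from rounding $p_0$ before the square root is taken.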
Table 13.

Number (m) of births in Sweden (rounded to tens).

s = 5 000 000, N = 20, M_0 = 140 000.

  Year       m         m - M_0      (m - M_0)^2

  1881    145 230      + 5 230       27 352 900
  1882    146 640      + 6 640       44 089 600
  1883    144 320      + 4 320       18 662 400
  1884    149 360      + 9 360       87 609 600
  1885    146 600      + 6 600       43 560 000
  1886    148 270      + 8 270       68 392 900
  1887    148 020      + 8 020       64 320 400
  1888    143 680      + 3 680       13 542 400
  1889    138 300      - 1 700        2 890 000
  1890    139 600      -   400          160 000
  1891    141 070      + 1 070        1 144 900
  1892    134 830      - 5 170       26 728 900
  1893    136 540      - 3 460       11 971 600
  1894    134 840      - 5 160       26 625 600
  1895    136 820      - 3 180       10 112 400
  1896    135 330      - 4 670       21 808 900
  1897    132 750      - 7 250       52 562 500
  1898    134 820      - 5 180       26 832 400
  1899    131 320      - 8 680       75 342 400
  1900    134 460      - 5 540       30 691 600

  Sum                  +53 190      654 401 400
                       -50 390


Table 14 gives the numbers of deaths, drownings, suicides,
marriages, and divorces in Sweden during the years 1876-1900.

The values of the mean ($M$), of the dispersion ($\sigma$), of the
Bernoulli dispersion ($\sigma_B$) and of the Lexis ratio ($L$), are
collected in Table 15.


Table 14.

Series from the official statistics of Sweden.

s = 5 000 000, N = 25.

  Year    Deaths    Drownings    Suicides    Marriages    Divorces

  1876    97 450      1349          462       35 200        239
  1877    92 740      1261          479       34 200        235
  1878    89 830      1239          453       32 160        226
  1879    84 240      1219          478       31 270        225
  1880    90 630      1349          421       31 670        238
  1881    88 360      1164          420       30 950        234
  1882    86 700      1564          526       31 630        213
  1883    86 330      1279          510       31 980        237
  1884    87 280      1236          464       32 510        259
  1885    88 390      1169          494       33 000        245
  1886    82 730       996          601       31 940        240
  1887    80 500      1256          541       31 170        246
  1888    79 850      1007          595       29 560        265
  1889    79 720      1166          561       29 820        251
  1890    85 500      1151          637       29 900        309
  1891    83 910      1040          638       29 090        287
  1892    89 350       950          708       28 440        329
  1893    83 980       979          698       28 210        304
  1894    81 520      123[?]        791       28 580        300
  1895    75 600      1034          754       29 200        310
  1896    77 850      1275          739       29 600        352
  1897    76 410      1029          760       30 160        348
  1898    75 000      1182          718       30 510        404
  1899    87 970      1133          774       31 100        380
  1900    83 850      1026          777       30 640        394


Table 15.

                  M          σ        σ_B       L

  Deaths       84 630      5427       288     18.81
  Drownings     1 171       140        34      4.10
  Suicides        600       124        24      5.08
  Marriages    30 900      1707       175      9.74
  Divorces        281      55.[?]      17      3.31


In all these cases supernormal dispersion is found, the most
in the number of deaths, the least--although still significant--in
the number of divorces.

Dispersion which is not supernormal is met extremely seldom in
applied statistics. The classic example for a series in which
normal dispersion is ordinarily expected is the number of new-born
boys, with the total number of births as number of comparison. We
found in Art. 8 from observations in Sweden in the years 1883 and
1890

$s = 500$, $N = 576$,

$M = 257.12$, $\sigma = 12.49$,

so that the probability ($p_0$) of the birth of a boy was

$p_0 = M : s = 0.51424.$

The corresponding Bernoulli dispersion is

$\sigma_B = 11.18,$

somewhat smaller than the observed dispersion (12.49). The Lexis
ratio is

$L = 1.117.$

The value of $L$ is so close to unity, that it may be doubted
whether the deviation is to be considered accidental or not. To
decide this, the mean error of $L$ must be known. I shall omit here
the formula for this mean error and communicate only the result in
this case. $L$ can be written, with mean error added:¹

$L = 1.117 \pm 0.033.$

Since the deviation from unity is somewhat more than three
times the mean error, this series can be considered supernormal,
although in a slight degree.

We can therefore conclude in accordance with Lexis' theorem
that the determination of the sex of the new-born can not entirely
be compared with a lottery, but that outer influences are also at
work here.

In order to aid in the investigation of these outer influences, 

I have arranged the data on birth of boys partly according to time 
(months and years), and partly according to space (different pro¬ 
vinces). It seems that the former division (according to time) 
gives a normal series--in one case even a subnormal series--, but 
that a supernormal series is obtained in the division according to 
provinces. It may be hence concluded that the probability of the 
birth of a boy does not essentially vary with the time (as far as 
the present investigation goes), but rather with the place. I con¬ 
clude that the sex ratio in the number of births may be a racial 
character--using the word race in the widest sense. 

1. Because of the relations treated in the next chapter, however, the
value of $L$ must be reduced somewhat.

Among the few subnormal statistical series that I have met
lies the number of births of twins with the number of single 
births as number of comparison (in Art. 33 the whole population 
was used as number of comparison). From the official statistics of 
Sweden for 1883 I obtained, by comparing the births of twins in 
the various provinces, L = 0.78 ± 0.11 . The mean error of L is 
too large to exclude the possibility of an accidentally small dis¬ 
persion in the number of births of twins in this year. If, 
however, by analysis of more comprehensive data, it were estab¬ 
lished that the number of births of twins formed a series with 
subnormal dispersion, the conclusion could be drawn that the 
fluctuations in the individual probability of birth of twins have a 
significantly greater influence on the dispersion than the fluctua¬ 
tions which stem from differences in the make-up of the population 
of the provinces. 

It is in the nature of things, as a mathematical analysis of 
the theorems of Poisson and of Lexis shews, that statistical 
series with subnormal dispersion are significantly more difficult 
to realise than series for which L > 1 . 

We found in Art. 32 that for new-born in general the Lexis
ratio had the value 15.5. Observing instead the number of births
of twins, we obtain (with $M = 2015$, $\sigma = 95.2$, $\sigma_B = 44.9$)
the value 2.12 for $L$, a significantly smaller $L$ than for births in
general.

This indicates an incompleteness in the Lexis ratio as a
measure of disturbing influences to which a statistical event
is exposed. Obviously the factors which operate to disturb the
probability of birth operate in nearly the same degree upon the
probability of birth of twins; nevertheless the Lexis ratio was
seven times as large in one case as in the other.

As may be seen from the theorem of Lexis, this incompleteness
is caused by the fact that the number $L$ is dependent upon $pq$ and
that $pq$ is smaller for birth of twins than for single birth.

Another defect of $L$ as measure of disturbing forces is that it
is dependent upon the number of comparison, varying roughly as
$\sqrt{s}$. Taking for example the number of new-born in the city of
Lund in the years 1882-1901, with $s = 16\,000$ inhabitants as number
of comparison, we have

    $M = 402, \quad \sigma = 24.9, \quad \sigma_B = 20.0$

and

    $L = 1.24.$

The numbers of births in Lund form, therefore, an almost
normal series, whereas we have found for Sweden as a whole
$L = 15.5$.

Obviously, however, the influence of outside disturbances on
the number of births in Lund is about as strong as in the country
generally.

For these reasons I have introduced, in place of the Lexis
ratio, or more properly in addition to it, a quantity which is
free from these defects and is therefore better suited as measure
for the intensity of outside influences. I designate this quantity
$\rho$, putting

(1)    $\rho = \dfrac{\sqrt{\sigma^2 - \sigma_B^2}}{M}$

and call $100\rho$ the coefficient of disturbancy of the statistical
series. Its value--with mean error--for some of the series here
given is:


                              Table 16.

          Values of statistical coefficients of disturbancy.

                    Trials                      s        100 rho

Suicides            different years       5 000 000    20.[illegible] ± 2.[illegible]
Divorces            different years       5 000 000    18.81 ± 2.70
Drownings           different years       5 000 000    11.62 ± 1.67
Deaths              different years       5 000 000     6.00
Marriages           different years       5 000 000     5.40 ± 0.70
Births              different years       5 000 000     4.07 ± 0.66
Births of twins     different years       5 000 000     4.17 ± 0.68
Births in Lund      different years          16 000     3.67 ± 0.50
Births of boys      different provinces      12 000     0.06 ± 0.14


We find confirmed here, as we had already supposed, that the
outer disturbances are just as strong upon births of twins
as upon single births and approximately as strong upon the
number of births in Lund as upon the number of births in the
entire kingdom.²
Most disturbed is the number of suicides and the number of
divorces, least is the number of births of boys (by provinces).
However, judging from the mean error, disturbing forces must be
considered present even in this last case.
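
The contrast between the two measures can be checked numerically. The following Python sketch evaluates the Lexis ratio $L = \sigma / \sigma_B$ and the coefficient of disturbancy $100\rho$ of formula (1) for the two series whose characteristics are quoted in the text (births of twins, births in Lund). The function names are illustrative only and do not come from the original.

```python
import math


def lexis_ratio(sigma, sigma_b):
    """Lexis ratio L: observed dispersion divided by Bernoulli dispersion."""
    return sigma / sigma_b


def disturbancy(mean, sigma, sigma_b):
    """Coefficient of disturbancy 100*rho, rho = sqrt(sigma^2 - sigma_B^2) / M."""
    excess = sigma ** 2 - sigma_b ** 2
    if excess < 0:
        # Subnormal dispersion: rho would be imaginary.
        raise ValueError("observed dispersion is below the Bernoulli dispersion")
    return 100.0 * math.sqrt(excess) / mean


# (M, sigma, sigma_B) as quoted in the text
series = {
    "births of twins": (2015.0, 95.2, 44.9),
    "births in Lund": (402.0, 24.9, 20.0),
}

results = {}
for name, (m, sigma, sigma_b) in series.items():
    results[name] = (lexis_ratio(sigma, sigma_b), disturbancy(m, sigma, sigma_b))
    print(f"{name}: L = {results[name][0]:.2f}, 100*rho = {results[name][1]:.2f}")
```

The two Lexis ratios differ widely, while the two coefficients of disturbancy lie close together. The Lund figure obtained from these rounded inputs comes out slightly above the tabulated 3.67.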

After the presence of disturbing influences on the statistical
series has been established in this manner, it is the task of the
statistician to trace the cause of these disturbances. The theory
of correlation, which will be treated in the second part of this
resume, gives the general method for this tracing.


2. Since $\rho$ is not significantly greater for the frequency of marriage
than for the frequency of birth, it is suggested that possibly the same
disturbing influences, at least partly, affect these two events. This
method of proof, however, should not be generalised.
38. However, certain conclusions about the nature of the disturbing
influences to which the statistical object is exposed may be
obtained solely by observation of the statistical series.

According to the theorem of Lexis these disturbances consist
of changes in the probability of the occurrence of an event from
one 'trial' to the next (in this case from one year to the next).
These changes, insofar as they are regular, may be treated
mathematically as divided into two principal groups: secular and
periodic changes.³ The determination of the latter is beset with
considerable difficulty--especially if the length of the period or
periods is unknown--whereas the secular changes of the fundamental
probabilities may be determined relatively easily.

39. Let it be assumed that the fundamental probabilities
$p_1, p_2, \dots, p_N$ are continuously increased (or decreased) by the
same amount $\beta$ from trial to trial (e.g. from year to year). The
elements of the statistical series are then on the average increased
(or decreased) from one trial to the next by the amount $s\beta$, where
$s$ is the number of comparison. Theory shews that the number $s\beta$ may
then be computed from the formula

    $s\beta = \dfrac{12}{N(N^2 - 1)} \sum_{k=1}^{N} \left(k - \dfrac{N+1}{2}\right)(m_k - M).$

As example for the application of this formula we shall
compute the secular increase of suicides in Sweden from the figures
given in Table 14. The complete treatment of these figures is to
be seen in Table 17.

Because here

    $\dfrac{N(N^2 - 1)}{12} = 1300,$

we have

    $s\beta = 21\,125 : 1300 = 16.25,$

so that the number of suicides in Sweden has on the average
increased by 16 cases per 5 000 000 inhabitants per year.⁴
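
The computation can be retraced in a few lines. The sketch below applies the formula for $s\beta$ to the yearly suicide figures of Table 17. The five values for 1881, 1883, 1886, 1892 and 1897 are illegible in this copy and are reconstructed here from the deviation and product columns of the table, so they should be read as an assumption; the function name is likewise illustrative.

```python
# Suicides per 5 000 000 inhabitants in Sweden, 1876-1900 (Table 17).
# Assumption: the entries for 1881, 1883, 1886, 1892 and 1897 are rebuilt
# from the deviation and product columns, not read directly from the table.
suicides = [462, 479, 453, 478, 421, 420, 526, 510, 464, 494,
            601, 541, 595, 561, 637, 638, 708, 698, 791, 754,
            739, 760, 718, 774, 777]


def secular_change(elements):
    """Average change per trial: 12 / (N (N^2 - 1)) * sum((k - (N+1)/2)(m_k - M))."""
    n = len(elements)
    mean = sum(elements) / n
    mid = (n + 1) / 2
    total = sum((k - mid) * (m - mean) for k, m in enumerate(elements, start=1))
    return 12 * total / (n * (n * n - 1))


s_beta = secular_change(suicides)
print(f"secular increase per year: {s_beta:.2f}")
```

Because the weights $k - (N+1)/2$ sum to zero, the result is the same whether the deviations are taken from the exact mean or from the round value 600 used in the table.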

3. Called by LEXIS <15, p. 33> evolutory (more generally symptomatic)
and periodic (more generally oscillatory) changes.

4. I have proved that, when the outside influences have essentially
secular character, $\beta$ may be approximately computed, by a simple
formula, from the coefficient of disturbancy $\rho$.


                              Table 17.

  Computation of the secular increase of the number of suicides in Sweden.

                 s = 5 000 000,  N = 25,  M = 600.

  Year     k     m_k     m_k - M     k - 13     (k - 13)(m_k - M)

  1876     1     462      - 138       - 12          + 1 656
  1877     2     479      - 121       - 11          + 1 331
  1878     3     453      - 147       - 10          + 1 470
  1879     4     478      - 122       -  9          + 1 098
  1880     5     421      - 179       -  8          + 1 432
  1881     6     420      - 180       -  7          + 1 260
  1882     7     526      -  74       -  6          +   444
  1883     8     510      -  90       -  5          +   450
  1884     9     464      - 136       -  4          +   544
  1885    10     494      - 106       -  3          +   318
  1886    11     601      +   1       -  2          -     2
  1887    12     541      -  59       -  1          +    59
  1888    13     595      -   5          0                0
  1889    14     561      -  39       +  1          -    39
  1890    15     637      +  37       +  2          +    74
  1891    16     638      +  38       +  3          +   114
  1892    17     708      + 108       +  4          +   432
  1893    18     698      +  98       +  5          +   490
  1894    19     791      + 191       +  6          + 1 146
  1895    20     754      + 154       +  7          + 1 078
  1896    21     739      + 139       +  8          + 1 112
  1897    22     760      + 160       +  9          + 1 440
  1898    23     718      + 118       + 10          + 1 180
  1899    24     774      + 174       + 11          + 1 914
  1900    25     777      + 177       + 12          + 2 124

                                                    + 21 125


After the secular change in the elements has been determined,
its influence may be eliminated from the statistical series and a
new series, free from secular influence, may be derived. From this
the periodic (or more generally the oscillatory = not uniform)
changes of the fundamental probabilities may be more easily
studied and determined than from the original statistical series.

If we have succeeded in determining the mathematical character
of the secular and periodic influences, we have taken an important
step toward investigating the foreign factors which disturb the
statistical object studied. The closer determination of these
factors, as we have already said, is accomplished by means of the
theory of correlation.
Cap. VIII. The reduced statistical Series.


42. Let

(1)    $m_1, m_2, m_3, \dots, m_N$

be a given statistical series with the corresponding numbers of
comparison

    $s_1, s_2, s_3, \dots, s_N.$

Multiplying the elements in (1) by $s/s_1, s/s_2, s/s_3, \dots, s/s_N$,
we obtain a new series

(2)    $\dfrac{s}{s_1} m_1, \ \dfrac{s}{s_2} m_2, \ \dfrac{s}{s_3} m_3, \ \dots, \ \dfrac{s}{s_N} m_N$

which is called the reduced statistical series, or the series
reduced to the basis $s$.

We shall compare (2) with the series that we would obtain if s 
had been the number of comparison for all the elements. Let 

(3)    $m_1', m_2', m_3', \dots, m_N'$

be this series. 

We designate the mean and dispersion of (2) by $M$ and $\sigma$, those
of (3) by $M'$ and $\sigma'$.

Under the hypothesis that the object observed follows the laws
of Bernoulli, it can be proved that

(4)    $\sigma = f_1 \sigma' = f_1 \sqrt{spq},$

where

(4*)   $f_1 = \sqrt{\dfrac{1}{N} \sum_{k=1}^{N} \dfrac{s}{s_k}}.$

Although, then, the means of the two series (2) and (3) will
agree on the average, the dispersions in general will differ.

Formula (4) gives us the Bernoulli dispersion of the reduced
statistical series (2). Therefore we must multiply the values of
$\sigma_B$ that we obtained in the previous chapter, where the number of
comparison varied from trial to trial, by the factor $f_1$, in order
to give the correct value of the Bernoulli dispersion.

The number of comparison $s$, to which the numbers from the
official statistics were reduced, was however so chosen there that
the factor of reduction $f_1$ differs only insignificantly from unity.

We find from (4*) that $f_1$ has the value 1 when $s$ is so chosen
that

    $\dfrac{1}{s} = \dfrac{1}{N} \sum_{k=1}^{N} \dfrac{1}{s_k},$

i.e. when $s$ is equal to the harmonic mean of the numbers of
comparison $s_1, s_2, s_3, \dots, s_N$. It is known that the harmonic mean
is always smaller than the arithmetic mean of the same quantities,
but that its deviation from the arithmetic mean is in general
insignificant. In practice, then, we may say that the factor of
reduction is near 1, if $s$ has a value which is near the arithmetic
mean of the numbers of comparison $s_1, s_2, s_3, \dots, s_N$. In those
examples of official statistics which were treated in the previous
chapter, $s$ was chosen in this manner.

Taking for example the statistical series of the numbers of
deaths, drownings, etc. for the years 1876-1900 in Art. 39, the
numbers given in the official statistics of Sweden were reduced to
a population of 5 000 000 ($= s$), whereas the actual population
($s_k$) varied between 4 430 000 and 5 136 000. We find from Table 18
that

    $\sum \dfrac{s}{s_k} = 26.318,$

and therefore

    $f_1 = \sqrt{26.318 : 25} = \sqrt{1.0527} = 1.026.$

$\sqrt{spq}$ is to be multiplied by this number to obtain the Bernoulli
dispersion. Correspondingly, the Lexis ratios in Table 15 are to be
lessened by division by 1.026. The supernormal dispersion is not
notably lessened thereby.


                 Table 18.

              s = 5 000 000.

         Year        s_k        s/s_k

         1876     4 430 000     1.129
         1877     4 485 000     1.115
         1878     4 532 000     1.103
         1879     4 579 000     1.092
         1880     4 566 000     1.095

         1881     4 572 000     1.094
         1882     4 579 000     1.092
         1883     4 604 000     1.086
         1884     4 644 000     1.077
         1885     4 683 000     1.068

         1886     4 717 000     1.060
         1887     4 738 000     1.055
         1888     4 748 000     1.053
         1889     4 774 000     1.047
         1890     4 785 000     1.045

         1891     4 803 000     1.041
         1892     4 807 000     1.040
         1893     4 824 000     1.036
         1894     4 873 000     1.026
         1895     4 919 000     1.016

         1896     4 963 000     1.007
         1897     5 010 000     0.998
         1898     5 063 000     0.988
         1899     5 097 000     0.981
         1900     5 136 000     0.974

                               26.318
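
The factor of reduction is easily recomputed from the populations of Table 18. The sketch below evaluates formula (4*) for $s$ = 5 000 000 and also shows that $f_1$ becomes exactly 1 when $s$ is set to the harmonic mean of the $s_k$. The function names are illustrative assumptions, not notation from the text.

```python
import math

S = 5_000_000  # number of comparison to which the series is reduced

# Population of Sweden, 1876-1900 (Table 18)
population = [
    4_430_000, 4_485_000, 4_532_000, 4_579_000, 4_566_000,
    4_572_000, 4_579_000, 4_604_000, 4_644_000, 4_683_000,
    4_717_000, 4_738_000, 4_748_000, 4_774_000, 4_785_000,
    4_803_000, 4_807_000, 4_824_000, 4_873_000, 4_919_000,
    4_963_000, 5_010_000, 5_063_000, 5_097_000, 5_136_000,
]


def reduction_factor(s, s_k):
    """Formula (4*): f1 = sqrt((1/N) * sum(s / s_k))."""
    return math.sqrt(sum(s / x for x in s_k) / len(s_k))


def harmonic_mean(s_k):
    """Harmonic mean of the numbers of comparison."""
    return len(s_k) / sum(1.0 / x for x in s_k)


f1 = reduction_factor(S, population)
h_mean = harmonic_mean(population)
a_mean = sum(population) / len(population)

print(f"f1 for s = 5 000 000: {f1:.3f}")
print(f"harmonic mean {h_mean:.0f} < arithmetic mean {a_mean:.0f}")
print(f"f1 for s = harmonic mean: {reduction_factor(h_mean, population):.6f}")
```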

It is not altogether necessary, however, to reduce the given
statistical series of elements with varying numbers of comparison
in such wise that $f_1$ is near unity, although this reduction often
has certain advantages. The number of comparison ($s$) of the reduced
series may be chosen ad lib. Computing the dispersion of the reduced
series in the customary manner and comparing it with the Bernoulli
dispersion given by formula

(5)    $\sigma_B = f_1 \sqrt{spq}$

we obtain the Lexis ratio ($L$) and the coefficient of disturbancy
($\rho$) from the formulae of the previous chapter. As an example I
shall compute the number of births of twins per 1000 single births
from the official statistics of Sweden for the year 1883.

The elements of the reduced series are found in the fifth
column of Table 19. They give the number of births of twins per
1000 single births in the 25 provinces of Sweden (Gotland is not
here excluded). The characteristics ($M$ and $\sigma$) of the reduced
series are

    $M = M_0 + b = 14 + 16.2 : 25 = 14.65,$

    $\sigma = \sqrt{85.3 : 25 - b^2} = \sqrt{2.990} = 1.73.$

Further, the factor of reduction has the value

    $f_1 = \sqrt{5.758 : 25} = \sqrt{0.2303} = 0.480.$

In order to compute the Bernoulli dispersion, we must in
addition know the probability ($p$) of birth of twins, which,
according to Table 19, is

    $p = M : s = 14.65 : 1000 = 0.01465.$

From formula (5) we now have

    $\sigma_B = 0.480 \sqrt{1000 \times 0.01465 \times 0.98535} = 1.824.$

The Lexis ratio ($L$) has therefore the value

    $L = 1.73 : 1.82 = 0.95 \pm 0.13,$

where the mean error of $L$ is also given. The dispersion is
subnormal, but not definitely so. We shall find below a better
method to compute it.
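
The whole computation for the simple reduced series can be repeated from the two observed columns of Table 19. The sketch below works with unrounded reduced elements, so its results agree with the printed ones only to about two figures; the function name and the returned tuple are illustrative choices.

```python
import math

S = 1000  # basis of reduction: twin births per 1000 single births

# (s_k, m_k): single births and births of twins in the 25 provinces (Table 19)
provinces = [
    (6251, 82), (4280, 62), (3328, 48), (4205, 64), (7038, 98),
    (5650, 77), (4895, 71), (6276, 84), (1060, 21), (4340, 62),
    (6291, 98), (10023, 168), (3668, 65), (7886, 106), (7201, 114),
    (6824, 98), (6563, 94), (5039, 75), (3788, 56), (5639, 87),
    (6131, 82), (6308, 87), (2883, 29), (3748, 57), (3334, 46),
]


def simple_reduced(data, s):
    """Mean, dispersion, f1, Bernoulli dispersion and Lexis ratio of series (2)."""
    n = len(data)
    reduced = [s * m / sk for sk, m in data]          # elements (s / s_k) m_k
    mean = sum(reduced) / n
    sigma = math.sqrt(sum((x - mean) ** 2 for x in reduced) / n)
    f1 = math.sqrt(sum(s / sk for sk, _ in data) / n)  # formula (4*)
    p = mean / s
    sigma_b = f1 * math.sqrt(s * p * (1 - p))          # formula (5)
    return mean, sigma, f1, sigma_b, sigma / sigma_b


mean, sigma, f1, sigma_b, lexis = simple_reduced(provinces, S)
print(f"M = {mean:.2f}, sigma = {sigma:.2f}, f1 = {f1:.3f}")
print(f"sigma_B = {sigma_b:.3f}, L = {lexis:.2f}")
```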

                              Table 19.

                           Reduced series.

     Births of twins per 1000 single births in different provinces.

                s_k = number of single births
                m_k = number of births of twins
                m'_k = (s/s_k) m_k
                s = 1000,  N = 25,  M_0 = 14.

    k      s_k     m_k    s/s_k    m'_k    m'_k - M_0    (m'_k - M_0)^2

    1     6 251     82    0.160    13.1      - 0.9            0.8
    2     4 280     62    0.234    14.5      + 0.5            0.2
    3     3 328     48    0.300    14.4      + 0.4            0.2
    4     4 205     64    0.238    15.2      + 1.2            1.4
    5     7 038     98    0.142    13.9      - 0.1            0.0
    6     5 650     77    0.177    13.6      - 0.4            0.2
    7     4 895     71    0.204    14.5      + 0.5            0.2
    8     6 276     84    0.159    13.4      - 0.6            0.4
    9     1 060     21    0.943    19.8      + 5.8           33.6
   10     4 340     62    0.230    14.3      + 0.3            0.1
   11     6 291     98    0.159    15.6      + 1.6            2.6
   12    10 023    168    0.100    16.8      + 2.8            7.8
   13     3 668     65    0.272    17.7      + 3.7           13.7
   14     7 886    106    0.127    13.5      - 0.5            0.2
   15     7 201    114    0.139    15.8      + 1.8            3.2
   16     6 824     98    0.147    14.4      + 0.4            0.2
   17     6 563     94    0.152    14.3      + 0.3            0.1
   18     5 039     75    0.198    14.9      + 0.9            0.8
   19     3 788     56    0.264    14.8      + 0.8            0.6
   20     5 639     87    0.177    15.4      + 1.4            2.0
   21     6 131     82    0.163    13.4      - 0.6            0.4
   22     6 308     87    0.159    13.8      - 0.2            0.0
   23     2 883     29    0.347    10.1      - 3.9           15.2
   24     3 748     57    0.267    15.2      + 1.2            1.4
   25     3 334     46    0.300    13.8      - 0.2            0.0

                          5.758             + 16.2          + 85.3


45. The reduced statistical series (2) has--as representing a
directly observed series with number of comparison $s$--a conspicuous
weakness. Assume that one time $s_1 = 10\,000$ drawings, and that
another time $s_2 = 100$ drawings are made from a pack of cards, and
$m_1$ and $m_2$ black cards obtained. If we wish to compute from this the
number of black cards in $s = 1000$ drawings, we obtain the values

    $\dfrac{s}{s_1} m_1$  and  $\dfrac{s}{s_2} m_2.$

But these values are obviously not equally valid. The mean
error of the latter value is indeed ten times as large as that of
the former. In series (2), however, each element, whatever its
number of comparison, influences the computation of $M$ and $\sigma$ to
the same extent.

46. One is therefore led to consider a series which is so
compounded that the element $\dfrac{s}{s_k} m_k$ occurs $s_k$ times
($k = 1, 2, \dots, N$). In this manner is obtained a series with
$s_1 + s_2 + \dots + s_N$ elements, which is called the reduced and
weighted statistical series. To distinguish it, the series (2) may
be called the simple reduced series. If $M_1$ and $\sigma_1$ are the mean
and dispersion of the latter, $M_2$ and $\sigma_2$ those of the former, then
the following theorems hold:

1. On the average $M_1 = M_2$.

2. On the average the mean error of $M_1$ is larger than that of $M_2$.

For the numerical computation of $M_2$ and $\sigma_2$ we have the
formulae

(7)    $M_2 = \dfrac{s \sum m_k}{\sum s_k},$

(8)    $\sigma_2 = f_2 \sqrt{\dfrac{s}{N} \sum \dfrac{1}{s_k} (m_k - p_0 s_k)^2},$

where $p_0$ is given by (11) and

(9)    $f_2 = \sqrt{\dfrac{N s}{\sum s_k}}.$

It can be proved that $f_2$ is never larger than $f_1$ (in formula
(4*)).

It is most convenient, however, to compute $\sigma_2$ from the average
deviation $\theta$ given by the formula

(10)   $\theta = \dfrac{f_2^2}{N} \sum |m_k - p_0 s_k|,$

where

(11)   $p_0 = \sum m_k : \sum s_k = M_2 : s$

and where, as usual, $|\ |$ signifies that all differences between
$m_k$ and $p_0 s_k$ are to be taken positive. Then

(11*)  $\sigma_2 = 1.2533\,\theta.$

It is noteworthy, as is seen from formulae (7) and (10), that
both the mean and the dispersion of the reduced and weighted
series may be had directly from the observed series

    $m_1, m_2, \dots, m_N; \quad s_1, s_2, \dots, s_N$

without first constructing the reduced series. The computation of
the characteristics of the reduced and weighted series is therefore
very simple.
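
A short sketch makes this directness concrete: starting from the observed pairs ($s_k$, $m_k$) of the twin series alone, it evaluates formulae (7), (9), (10), (11) and (11*) and then the Lexis ratio. The constant 1.2533 is taken as $\sqrt{\pi/2}$, the ratio of dispersion to average deviation for a normal distribution; the function name is an illustrative assumption, and unrounded intermediate values are used, so the last digit may differ from the printed figures.

```python
import math

S = 1000  # basis of reduction

# (s_k, m_k): single births and births of twins in the 25 provinces (Table 20)
provinces = [
    (6251, 82), (4280, 62), (3328, 48), (4205, 64), (7038, 98),
    (5650, 77), (4895, 71), (6276, 84), (1060, 21), (4340, 62),
    (6291, 98), (10023, 168), (3668, 65), (7886, 106), (7201, 114),
    (6824, 98), (6563, 94), (5039, 75), (3788, 56), (5639, 87),
    (6131, 82), (6308, 87), (2883, 29), (3748, 57), (3334, 46),
]


def reduced_weighted(data, s):
    """Characteristics of the reduced and weighted series, formulae (7)-(11*)."""
    n = len(data)
    sum_s = sum(sk for sk, _ in data)
    sum_m = sum(m for _, m in data)
    p0 = sum_m / sum_s                                   # (11)
    m2 = s * p0                                          # (7)
    f2 = math.sqrt(n * s / sum_s)                        # (9)
    theta = f2 ** 2 * sum(abs(m - p0 * sk) for sk, m in data) / n  # (10)
    sigma2 = math.sqrt(math.pi / 2) * theta              # (11*), factor 1.2533
    sigma_b = f2 * math.sqrt(s * p0 * (1 - p0))          # Bernoulli dispersion
    return p0, m2, f2, sigma2, sigma_b, sigma2 / sigma_b


p0, m2, f2, sigma2, sigma_b, lexis = reduced_weighted(provinces, S)
print(f"p0 = {p0:.5f}, M2 = {m2:.2f}, f2 = {f2:.4f}")
print(f"sigma2 = {sigma2:.3f}, sigma_B = {sigma_b:.3f}, L = {lexis:.3f}")
```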


                              Table 20.

                     Reduced and weighted series.

     Births of twins per 1000 single births in different provinces.

                s_k = number of single births
                m_k = number of births of twins
                s = 1000,  N = 25.

    k       s_k      m_k     p_0 s_k     |m_k - p_0 s_k|

    1      6 251      82       91.0            9.0
    2      4 280      62       62.3            0.3
    3      3 328      48       48.5            0.5
    4      4 205      64       61.2            2.8
    5      7 038      98      102.5            4.5
    6      5 650      77       82.3            5.3
    7      4 895      71       71.3            0.3
    8      6 276      84       91.4            7.4
    9      1 060      21       15.4            5.6
   10      4 340      62       63.2            1.2
   11      6 291      98       91.6            6.4
   12     10 023     168      145.9           22.1
   13      3 668      65       53.4           11.6
   14      7 886     106      114.8            8.8
   15      7 201     114      104.8            9.2
   16      6 824      98       99.4            1.4
   17      6 563      94       95.6            1.6
   18      5 039      75       73.4            1.6
   19      3 788      56       55.2            0.8
   20      5 639      87       82.1            4.9
   21      6 131      82       89.3            7.3
   22      6 308      87       91.8            4.8
   23      2 883      29       42.0           13.0
   24      3 748      57       54.6            2.4
   25      3 334      46       48.5            2.5

         132 649   1 931                     135.3
47. As example for the application of these formulae I take the
series that was treated in Art. 44 as a simple reduced series.
By formula (11) we find from the second and third columns of
Table 20

    $p_0 = 1931 : 132\,649 = 0.01456$

as relative probability of birth of twins. Multiplying $p_0$ by the
numbers $s_k$ we obtain the numbers of the fourth column. We further
obtain

    $M_2 = 1000\, p_0 = 14.56$

and

    $f_2 = \sqrt{\dfrac{25 \times 1000}{132\,649}} = 0.4341$

and by (10)

    $\theta = (0.4341)^2 \times 135.3 : 25$

so that by (11*)

    $\sigma_2 = 1.278.$ ¹

Comparison with the results reached for the same series as a
simple reduced series in Art. 44 shews that the means correspond
well and that $\sigma_2 < \sigma_1$.

48. The Bernoulli dispersion for the reduced and weighted series
is obtained from the formula

    $\sigma_B = f_2 \sqrt{s\, p_0 (1 - p_0)},$

so that in this example

    $\sigma_B = 0.4341 \sqrt{1000 \times 0.01456 \times 0.98544} = 1.644,$

so that the Lexis ratio has the value

    $L = 1.278 : 1.644 = 0.777,$

as I already stated in Art. 35.

For the mean error of $M_2$ we obtain the value

    $\varepsilon(M_2) = \dfrac{\sigma_2}{\sqrt{N}} = \dfrac{1.278}{\sqrt{25}} = 0.256,$

whereas the arithmetic mean $M_1$ of the simple reduced series
(computed in Art. 44) has the mean error

    $\varepsilon(M_1) = \dfrac{\sigma_1}{\sqrt{N}} = \dfrac{1.73}{\sqrt{25}} = 0.346,$

so that, in correspondence with theory, $M_2$ is more reliable
than $M_1$.

1. Notice that $\sigma_2$ signifies the observed dispersion.

As one realises from the example treated above, the reduced
and weighted series is convenient to use in numerical computation.
The sums $\sum s_k$ and $\sum m_k$ are, and were already in the days of
Süssmilch, customarily given in statistical tables, and a good
part of the computation is already done thereby.

Nevertheless the simple reduced series may be preferred when

1. the factor of reduction ($f_1$ or $f_2$) is near unity, and when

2. the number of elements is so large that a division into
classes must be undertaken.

It is further to be remarked that the theoretical excellence
of the reduced and weighted series is only proven for series with
normal (or nearly normal) dispersion. For series with supernormal
dispersion--and these form the largest part of the series in
applied statistics--more exact characteristics can be derived,
although even for these the reduced and weighted series serves
well. However, whether the simple reduced or the reduced and
weighted series is used, it is well not to proceed from elements
whose numbers of comparison ($s_k$) differ too widely.
Part Two.


Heterograde or qualitative Statistics.


Cap. IX. Introduction.
In the previous chapters I have treated the two first
characteristics--the mean and the dispersion--of a homograde
statistical series, and shewn how their numerical values may be
derived from the elements of the series. Also I have displayed, by
means of examples from official statistics, the significance of the
theorems of Bernoulli, Poisson, and Lexis for the interpretation of
the values found for the characteristics.

The essence of our investigations was to obtain a measure of
the disturbing outside influences upon the statistical object,
influences which in general cause the fluctuations of the numerical
value of the elements of the observed statistical series to fail
to obey those simple laws abstracted from cases of definite
drawings of cards from a pack and from such other experiments as
lead to the Bernoullian laws of probability.

To continue our treatment of homograde statistics it were 
necessary 

1. to investigate the higher characteristics of statistical 
series, 

2. to treat the connexion between simultaneously presented 
homograde statistical series, i.e. the theory of correlation 
for homograde quantities. 

In order to save space, I shall treat these two questions, as
well as the related problem of frequency curves, together for the
two parts of statistics. This juxtaposition is practicable after
the theory of Lexis and the properties of the reduced series have
been treated, since they find application only in homograde
statistics. I shall however present in the sequel several points
at which the homograde statistical series requires special
treatment.

Before going on let us observe an example of a heterograde
statistical series. I take the figures for rainfall in Lund for
the years 1889-1908.

                        Table 21.

                    Rainfall in Lund.

                  N = 20,  M_0 = 600.¹

         X = total annual rainfall in millimeters.

         Year      X      X - M_0     (X - M_0)^2

         1889     567      -  33         1 100
         1890     594      -   6             0
         1891     720      + 120        14 400
         1892     615      +  15           200
         1893     625      +  25           600
         1894     724      + 124        15 400
         1895     673      +  73         5 300
         1896     703      + 103        10 600
         1897     648      +  48         2 300
         1898     728      + 128        16 400
         1899     511      -  89         7 900
         1900     661      +  61         3 700
         1901     597      -   3             0
         1902     541      -  59         3 500
         1903     663      +  63         4 000
         1904     563      -  37         1 400
         1905     607      +   7             0
         1906     576      -  24           600
         1907     530      -  70         4 900
         1908     717      + 117        13 700

                        + 884 - 321    106 000

1. Notice that a datum that must not be missing when homograde
statistical series are treated is missing here: the number of comparison
($s$). Indeed, when heterograde elements are discussed, a 'number of
comparison' can never be given.

The numbers in the second column of the table give the rainfall
in millimeters for each of the years 1889-1908. Each of these
numbers forms an element of the statistical series. The total
number of elements is 20.

The mean and dispersion of the series are computed in exactly
the same manner as for a homograde series. We choose a provisional
mean, here taken at 600, write the numbers $X - M_0$ and $(X - M_0)^2$,
and take the sums. In view of the small number of elements it
suffices to round the squares to hundreds. We then have, by the
rules of the previous chapter,

    $b = (884 - 321) : 20 = +28.2,$

    $M = 600 + b = 628.2,$

    $\sigma = \sqrt{106\,000 : 20 - (28.2)^2} = 67.1.$

Computing the mean errors by the formulae of Cap. IV, we have

    $M = 628.2 \pm 15.0,$

    $\sigma = 67.1 \pm 10.6.$
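
The arithmetic of this example can be retraced as follows. The sketch uses the provisional-mean method of the text, but with exact squares instead of squares rounded to hundreds. Two assumptions should be noted: the rainfall values for 1891, 1896, 1903 and 1904 are reconstructed as 600 plus the tabulated deviation, and the mean errors are taken as $\sigma/\sqrt{N}$ and $\sigma/\sqrt{2N}$, which reproduce the printed values but are not quoted from Cap. IV here.

```python
import math

# Annual rainfall in Lund, 1889-1908, in millimeters (Table 21).
# Assumption: 720, 703, 663 and 563 are rebuilt from the deviation column.
rainfall = [567, 594, 720, 615, 625, 724, 673, 703, 648, 728,
            511, 661, 597, 541, 663, 563, 607, 576, 530, 717]
M0 = 600  # provisional mean

n = len(rainfall)
deviations = [x - M0 for x in rainfall]

b = sum(deviations) / n                      # correction to the provisional mean
mean = M0 + b
sigma = math.sqrt(sum(d * d for d in deviations) / n - b * b)

# Assumed mean-error formulae (consistent with the printed 15.0 and 10.6)
err_mean = sigma / math.sqrt(n)
err_sigma = sigma / math.sqrt(2 * n)

print(f"M = {mean:.1f} +/- {err_mean:.1f}")
print(f"sigma = {sigma:.1f} +/- {err_sigma:.1f}")
```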

If the number $N$ of elements were large, a division into
classes would be advised. The process to be applied here, likewise
the computation of the characteristics, is exactly the same as we
have given in Cap. III for a homograde statistical series. The
distinction between the two kinds of series lies elsewhere.

53. Whereas in homograde statistics the grouping of the elements
about the mean can be completely satisfactorily explained by the
theorems of Bernoulli, Poisson, and Lexis, the situation with
heterograde statistical series is completely different. No
mathematical argument, for example, suffices to predict the mean
deviation of temperatures in Lund for the month of May, and
no a priori conclusions suffice to estimate, even approximately,
the dispersion of a statistical series concerned with the
height of adults (of a certain race). Not only is it not possible
to make such a priori estimates of the dispersion (or
of other characteristics) of a heterograde series, but it even
appears doubtful whether in general the elements of a series of
heterograde individuals are grouped according to a normal law that
can be described by simple mathematical formulae.

Experience has however shewn that at least this last is the
case. It has indeed been found that a statistical series of
heterograde individuals possesses essentially the same properties
as a similar series of homograde individuals. The principal
distinction is that whereas the characteristics of the latter series
can be computed--to a better or worse approximation--from some
simple theorems of the mathematical theory of probability,
these theorems desert us completely for a series of heterograde
individuals.

We are here compelled to seek other means of explanation. The
first step is to seek a sufficiently general hypothesis to explain
how the deviation that an individual of the population shews with
respect to the attribute to be investigated may be composed. The
hypothesis upon which I have built my investigations is that each
individual deviation may be considered as the sum of a set of
unknown but small quantities, which are called elementary errors.²

Observe, for example, the height in a population of grown men.
If it can be assumed that all men of the population have completely
similar ancestry; that they have been exposed to exactly the
same upbringing, food, climatic influences; that all other
circumstances, which might affect the height of man, have been
identical for all men of the population: then we must conclude, as
surely as an effect is determined by its cause, that the height of
all these men would be the same. The differences in the heredity³,
upbringing, food, etc., may here be considered as so many sources of
error with respect to the height of these men. Each source of
error causes a positive or negative elementary error in the height.
The resulting deviation of the height of an individual from the
ideal height that they would all have, if they had been exposed to
the same influences, consists of the sum of all these small
quantities. Obviously the number of sources of error must be
considered as very large or infinite.

To derive from the hypothesis of elementary errors the laws
that hold for the distribution of the elements in a heterograde
statistical series is an assignment in the field of mathematical
probability theory. To give without mathematical arguments a
presentation of the train of thought that can be followed in this
connexion is by no means easy; I do not intend here to make such an
attempt. I confine myself to pointing out that Laplace has first
shewn how such problems may be solved, although he did not carry
the analysis as far as is required for the derivation of the
general laws.⁴

Some preliminary remarks are in order about the terminology
here used. We assume that the elements are arranged in classes.
Let $x$ be the class mark and $y$ the number of elements belonging to
the class with the mark $x$. It is then possible to derive from the
mathematical theory a formula, or mathematical expression, of such
form in $x$ and $y$, that $y$ can be computed analytically for any value
of $x$. This expression is called a frequency function. If $x$ and $y$
are graphically presented, plotting $x$ as the abscissa and $y$ as the
ordinate, the result is known as a frequency curve.

It can be shewn that these frequency curves (frequency
functions) occur in two different forms, which I call frequency
curves (frequency functions) of Type A and of Type B.


2. The hypothesis as such is not new, but was used as early as 1837
by Hagen, in a somewhat more special form, to derive the Gaussian law
of error <11, p. 29>.

3. 'Heredity' of course here includes an infinite class of sources
of error; any attempt to classify these more closely is here superfluous.

4. See <2>, <3>.
I shall explain some properties of these types in Capp. XI
and XII. If the higher characteristics are small, the frequency
curves of Type A approach the Gaussian curve of error, which will
be studied in the following chapter under the name normal curve.


Cap. X. The normal frequency Curve. 


We found in the third chapter that the dispersion $\sigma$ can serve
to illuminate the manner in which the elements of a statistical
series are grouped about the mean $M$. We made hereon the
observations that the number of elements lying between the limits
$M - \sigma$ and $M + \sigma$ is about two thirds of the total number, and
further that all the elements generally lie between the bounds
$M - 3\sigma$ and $M + 3\sigma$. Therefore the elements obviously lie more
densely in the neighbourhood of the mean than elsewhere. We can
conclude from the second of our observations that the number of
elements declines rapidly when one goes away from the mean.

If the higher characteristics are small, as is assumed 
throughout this chapter, we can predict much more exactly how the 
elements of the series are distributed about the mean. 

We shall always assume that the elements are arranged in
classes. Let $w$ be the class interval, $x$ the class mark, and $y$ the
number of elements belonging to the class $x$; then the normal
distribution of elements follows the simple law¹

(1)    $y = \dfrac{Nw}{\sigma\sqrt{2\pi}}\; e^{-\frac{(x - b)^2 w^2}{2\sigma^2}}.$

By this formula the theoretic number, $y$, of elements belonging
to the class $x$, can be calculated.
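
Formula (1) translates directly into a small function. In the sketch below the numerical values of $N$, $w$, $\sigma$ and $b$ are hypothetical and chosen only for illustration; `phi0` is the ordinate of the standard normal curve, $e^{-X^2/2}/\sqrt{2\pi}$. As a plausibility check, the theoretic class frequencies summed over all classes should return $N$ very nearly.

```python
import math


def phi0(big_x):
    """Ordinate of the standard normal curve at X."""
    return math.exp(-big_x * big_x / 2) / math.sqrt(2 * math.pi)


def normal_class_count(x, n_total, w, sigma, b):
    """Theoretic number y of elements in the class with mark x, formula (1).

    x and b are measured in class-interval units, sigma in original units.
    """
    big_x = (x - b) * w / sigma
    return n_total * w / sigma * phi0(big_x)


# Hypothetical characteristics, for illustration only
N_TOTAL, W, SIGMA, B = 1000, 2.0, 5.0, 0.3

counts = {x: normal_class_count(x, N_TOTAL, W, SIGMA, B) for x in range(-30, 31)}
total = sum(counts.values())

print(f"y at x = 0: {counts[0]:.1f}")
print(f"sum of all theoretic class frequencies: {total:.3f}")
```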

In graphic portrayal of the normal curve it is useful so to
choose the units (the scale) of the x- and y-coordinates, that all
normal curves are directly comparable. This is accomplished by
introducing normal coordinates $X$ and $Y$, defined by the equations²

(2)    $X = \dfrac{(x - b)w}{\sigma}, \qquad Y = \dfrac{\sigma}{Nw}\, y,$

so that the equation of the normal curve takes the form

(3)    $Y = \varphi_0(X) = \dfrac{1}{\sqrt{2\pi}}\, e^{-\frac{X^2}{2}}.$

The function $\varphi_0(X)$ is called the probability function. Its
value is given in the following table to three places of decimals.

1. Here $bw = M - M_0$, so that $b$ (but not $\sigma$) is expressed in class
interval units.

2. Thus the unit of X-coordinates is the dispersion.


                              Table 22.

                      The probability function.

     X    phi_0(X)      X    phi_0(X)      X    phi_0(X)      X    phi_0(X)

    0.0    0.399       1.0    0.242       2.0    0.054       3.0    0.004
    0.1    0.397       1.1    0.218       2.1    0.044       3.1    0.003
    0.2    0.391       1.2    0.194       2.2    0.035       3.2    0.002
    0.3    0.381       1.3    0.171       2.3    0.028       3.3    0.002
    0.4    0.368       1.4    0.150       2.4    0.022       3.4    0.001
    0.5    0.352       1.5    0.130       2.5    0.018       3.5    0.001
    0.6    0.333       1.6    0.111       2.6    0.014       3.6    0.001
    0.7    0.312       1.7    0.094       2.7    0.010
    0.8    0.290       1.8    0.079       2.8    0.008
    0.9    0.266       1.9    0.066       2.9    0.006

If we wish to compare an observed statistical series with the 
normal curve, we must--after calculating the characteristics of 
the series--derive the values of the normal coordinates by means 
of formulae (2), where the observed number of elements belonging 
to the class x is to be substituted for y. If, moreover, we use 
Table 22 to construct the normal curve, the computation of normal 
coordinates and their imposition on the diagram is the work of a 
few minutes. I advise my reader to carry out such a construction 
himself; the work involved is simple and moreover gives a concrete 
picture of the manner in which the elements of a statistical 
series are distributed. Here, as with statistical calculations in 
general, the arithmetic is best done with a computing machine. 
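
The procedure just described can be sketched in a few lines of Python. This is only an illustration, not the author's computing schema: it takes the class frequencies of the first example treated below (boys per 500 births, Table 23), computes b and the dispersion directly from the first two moments, and then applies formulae (2). The function names are hypothetical, and the assumption is made that the dispersion is expressed in class-interval units, so that w cancels.

```python
import math

# Class marks x (in class-interval units from the provisional mean) and
# observed frequencies F(x) for the boys-per-500-births series (Table 23).
x_marks = list(range(-11, 10))
freqs = [1, 0, 0, 1, 2, 5, 13, 18, 47, 60, 81,
         108, 91, 60, 44, 22, 16, 6, 0, 0, 1]


def phi0(X):
    """Probability function of formula (3)."""
    return math.exp(-X * X / 2.0) / math.sqrt(2.0 * math.pi)


def normal_coordinates(x_marks, freqs):
    """Return (b, sigma, rows); b and sigma are in class-interval units.

    With sigma measured in units of w, formulae (2) reduce to
    X = (x - b) / sigma and Y = sigma * F(x) / N.
    """
    n = sum(freqs)
    b = sum(x * f for x, f in zip(x_marks, freqs)) / n
    second_moment = sum(x * x * f for x, f in zip(x_marks, freqs)) / n
    sigma = math.sqrt(second_moment - b * b)
    rows = [(x, f, (x - b) / sigma, sigma * f / n)
            for x, f in zip(x_marks, freqs)]
    return b, sigma, rows


b, sigma, rows = normal_coordinates(x_marks, freqs)
print(f"b = {b:+.3f} w, sigma = {sigma:.3f} w")
for x, f, X, Y in rows:
    # Observed Y beside the ordinate of the normal curve at the same X.
    print(f"{x:+3d} {f:4d} {X:+6.2f} {Y:.3f} {phi0(X):.3f}")
```

Comparing the last two printed columns class by class gives the same information as laying the small circles over the normal curve in the diagram.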

As first example I shall treat a homograde statistical series.
I choose for this purpose the series that has been used several
times already in illustration of statistical theorems: the number
of boys per 500 births in Sweden (for different months and
provinces). The series is arranged in classes in Table 5; data
derived therefrom are given in the first two columns of Table
23. We find the computation of b and $\sigma$ elsewhere. The normal
coordinates for each class are now computed by means of formulae
(2), and the corresponding points shewn on the diagram by small
circles.

As Fig. 1 shews, the observations follow the normal curve, given
by the solid line, very closely. The largest discrepancy occurs at
x = 0, so that the number of boys in 500 births is found somewhat
more often in the neighbourhood of the mean (257) than could be
expected from the normal frequency curve. Observe further that
departures from the normal curve shew no definite systematic
course. The departures may in general be considered accidental.
The number of observations is not large enough to determine with
certainty whether the frequency curve for this statistical object
depart from normal form.


Table 23.

Number of boys per 500 births.

N = 576, w = 5,
b = +0.024 w = +0.12, $\sigma$ = 2.498 w = 12.49.

   x    F(x)      X       Y

 -11      1    -4.41    .004
 -10      0    -4.01    .000
 - 9      0    -3.61    .000
 - 8      1    -3.21    .004
 - 7      2    -2.81    .009
 - 6      5    -2.41    .022
 - 5     13    -2.01    .056
 - 4     18    -1.61    .078
 - 3     47    -1.21    .205
 - 2     60    -0.81    .260
 - 1     81    -0.41    .351
   0    108    -0.01    .468
 + 1     91    +0.39    .395
 + 2     60    +0.79    .260
 + 3     44    +1.19    .191
 + 4     22    +1.59    .095
 + 5     16    +1.99    .069
 + 6      6    +2.39    .026
 + 7      0    +2.79    .000
 + 8      0    +3.19    .000
 + 9      1    +3.59    .004


[Fig. 1. Number of boys per 500 births.]


As example of a heterograde statistical series I chose the
periods of gestation in cows, according to observations in the
agricultural institute at Alnarp, which have very kindly been
placed at my disposal. The period is counted from the date of
covering to the birth of the calf. The mean value of the period
was 278.15 days and the dispersion was 5.35 days. The number of
cases registered was 393.

We find (from Table 24 and Fig. 2) that the number of elements
in the different classes is again arranged in accordance with the
normal curve. The deviations are somewhat greater than in the
previous example, doubtless because of the small number of
elements. They have, however, an accidental character, and we are
not in a position to decide from the material at hand whether, or
in what way, the frequency curve for the period of gestation of
cows deviate from the normal form.

The more elements in the statistical series, the more uniform
the distribution of deviations from the normal form. In most
cases, again, the correspondence with the normal curve becomes
better the more numerous the elements are. This, of course, is
particularly true where there are grounds to assume that the
phaenomenon in question follows the simple probability laws of
Bernoulli. Taking, for example, the series in Table 10 with 1000
elements (number of black cards in 10 drawings), we get the
diagram Fig. 3.

The agreement with the normal curve is here practically
complete. If, instead of this, one were to take the series of
Table 9 (number of black cards in 50 drawings), which contains
only 200 elements, one would have to expect significantly greater
deviations from the normal curve. Obviously these deviations must
here be considered totally accidental.

4. With mean error added, M = 278.15 ± 0.27, and $\sigma$ = 5.35 ± 0.19.


Table 24.

Period of gestation in cows.

N = 393, w = 2 days,
$M_0$ = 277.5, $\sigma$ = 2.676 w,
b = +0.333.

   x    F(x)      X       Y

 - 8      2    -3.11    .014
 - 7      2    -2.74    .014
 - 6      7    -2.37    .048
 - 5      9    -1.99    .061
 - 4     10    -1.62    .068
 - 3     19    -1.25    .129
 - 2     37    -0.87    .252
 - 1     52    -0.50    .354
   0     69    -0.12    .470
 + 1     54    +0.25    .368
 + 2     58    +0.62    .395
 + 3     30    +1.00    .204
 + 4     25    +1.37    .170
 + 5     10    +1.74    .068
 + 6      3    +2.12    .020
 + 7      5    +2.49    .034
 + 8      1    +2.87    .007


[Fig. 2. Periods of gestation in cows.]

[Fig. 3. Number of black cards in 10 drawings.]


Cap. XI. Frequency Curves of Type A.


Table 25.

   X       $\varphi_0$        $\varphi_3$        $\varphi_4$

 -3.5   +0.0009   +0.0283   +0.0694
 -3.0   +0.0044   +0.0798   +0.1330
 -2.5   +0.0175   +0.1424   +0.0800
 -2.0   +0.0540   +0.1080   -0.2700
 -1.5   +0.1295   -0.1457   -0.7043
 -1.0   +0.2420   -0.4839   -0.4839
 -0.5   +0.3521   -0.4841   +0.5501
  0     +0.3989    0.0000   +1.1968
 +0.5   +0.3521   +0.4841   +0.5501
 +1.0   +0.2420   +0.4839   -0.4839
 +1.5   +0.1295   +0.1457   -0.7043
 +2.0   +0.0540   -0.1080   -0.2700
 +2.5   +0.0175   -0.1424   +0.0800
 +3.0   +0.0044   -0.0798   +0.1330
 +3.5   +0.0009   -0.0283   +0.0694


62. In the preceding chapter we have attempted to shew by some examples
how statistical series of heterograde or homograde elements
approach more or less a certain form of frequency distribution; 
which form, graphically portrayed, is known as the normal curve. 
We have further seen that departures from this form occur, which 
have in many cases a purely accidental character and vary from one 
series to another. In other cases, particularly in statistical 
series that contain a large number of elements (say over 1000), 
however, these departures display a regular and systematic charac¬ 
ter, pointing to frequency curves which have a form differing from 
the 'normal'. I shall give in this chapter some properties of 
these curves. 

It is again assumed that the elements are arranged in classes. 
It is further assumed that the normal coordinates X and Y are
introduced by formulae (2) of the preceding chapter in place of the
class mark x and the frequency F(x). 

The general equation for frequency curves of Type A is then

$$Y = \varphi_0 + \beta_3\varphi_3 + \beta_4\varphi_4 + \beta_5\varphi_5 + \cdots \tag{1}$$

where $\varphi_0$, or, in full, $\varphi_0(X)$, designates as before the
probability function and $\varphi_3$, $\varphi_4$, etc., represent the
derivatives of this function.

1. <The $\beta_3$, etc., here are not the same as the frequently used
$\beta$-statistics introduced by PEARSON. --Tr.>

I give here an abridged table of these functions, which will 
in many cases suffice for the graphic construction of a frequency 
curve. 

Multiplying the numbers of the third column by $\beta_3$ and those of
the fourth column by $\beta_4$, and adding these products to the numbers
of the second column, we obtain the theoretic Y-coordinates.
Joining by a line the points obtained in this wise, we have the
theoretic frequency curve. The observed frequency curve is
obtained by computing the normal coordinates for each class,
according to formulae (2) of the preceding chapter.
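
The construction of the theoretic Y-coordinates can be sketched as follows. The closed forms used for the third and fourth derivatives of the probability function are standard results assumed here rather than quoted from the text; they reproduce the entries of Table 25. The function names, and the sample coefficients (those of the bean series treated later in this chapter, taking $\beta_3$ = S/3 and $\beta_4$ = E/3), are chosen for illustration only.

```python
import math


def phi0(X):
    """Probability function (normal ordinate in normal coordinates)."""
    return math.exp(-X * X / 2.0) / math.sqrt(2.0 * math.pi)


def phi3(X):
    """Third derivative of phi0: (3X - X^3) * phi0(X)."""
    return (3.0 * X - X ** 3) * phi0(X)


def phi4(X):
    """Fourth derivative of phi0: (X^4 - 6X^2 + 3) * phi0(X)."""
    return (X ** 4 - 6.0 * X ** 2 + 3.0) * phi0(X)


def type_a(X, beta3, beta4):
    """Theoretic Y-coordinate from the first three terms of the series."""
    return phi0(X) + beta3 * phi3(X) + beta4 * phi4(X)


# Illustrative coefficients only: beta3 = S/3, beta4 = E/3 with
# S = -0.141 and E = +0.030.
beta3, beta4 = -0.047, 0.010
for i in range(-7, 8):
    X = i / 2.0
    print(f"{X:+.1f} {phi0(X):+.4f} {phi3(X):+.4f} {phi4(X):+.4f} "
          f"{type_a(X, beta3, beta4):+.4f}")
```

The first four printed columns repeat Table 25; the fifth is the theoretic curve that would be drawn through the diagram.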

The diagram is adapted from that used at the Lund observatory.
The printed curve is the normal curve. In the computations
under the diagram the first four lines give the observed normal
coordinates, computed for each class x by formulae (2) of the
preceding chapter; the five succeeding lines give the theoretic
Y-coordinates, which for simplicity's sake are computed for X =
-3.5, -3.0, -2.5, etc.

If the characteristics $\beta_3$ and $\beta_4$ are not known, the normal
curve and the first four lines under the diagram suffice.

63. The coefficients $\beta_3$, $\beta_4$, $\beta_5$, etc., may be considered
together with M and $\sigma$ as the characteristics of the given frequency
curve. For practical reasons, however, it is advantageous somewhat
to modify the definition of these characteristics (cf. the next
section).

The calculation of the coefficients $\beta_3$, $\beta_4$, etc., from the
given statistical series can be performed by elementary methods.
I give on an adjacent page the complete schema with appropriate
checks, as given in [4]. The whole computation takes, for a not
particularly practised computer, about an hour.

64. When the number of elements in the statistical series is
not exceptionally large, it suffices in general to compute the
coefficients $\beta_3$ and $\beta_4$. The characteristics which these two
coefficients determine are called the skewness S and the excess E,
and are defined by the relations

$$S = 3\beta_3, \qquad E = 3\beta_4.$$
The skewness S--also called the coefficient of asymmetry--
gives, as the name suggests, a skew form to the frequency curve,
so that the elements no longer, as with the normal curve, are
symmetrically distributed about the mean. When the skewness is
positive there is a larger number (2SN/3) of elements greater than
the mean than there is of elements smaller than the mean, and
conversely when the skewness is negative. The highest point
of the frequency curve, which corresponds to that value of the
statistical quantity at which the elements are most dense (in the
case of a heterograde series), no longer coincides with the mean,
but is separated from it by the distance S$\sigma$, on the positive or
negative side according to the sign of S. The arithmetic mean no
longer gives the most probable value of the attribute; this
coincides with the value corresponding to the highest point of the
frequency curve.

2. <Edgeworth recommends the addition of the term [coefficient
illegible] $\varphi_6$. When this term is used, the excess E is to be
replaced by $E_g$ = [illegible]. --Tr.>

[Schema for the Computation of Characteristics, with checks.]

The excess E does not influence the symmetry of the frequency
curve, but alters the distribution, determined by the normal
curve, of elements into different classes. If the excess is
positive, the number of elements in the neighbourhood of the mean
is greater than in a normal distribution. The frequency curve is
elevated above the normal curve in the centre (therefore, in the
neighbourhood of the mean), whence the name excess. The definition
of excess is so chosen that this elevation is equal to the product
of E by the height of the normal curve.


Table 26.

Width of brown beans.

N = 12000, w = 0.25 mm,
$M_0$ = 8.825.

   x    F(x)      X       Y

 -10      3    -3.53    .000
 - 9      5    -3.12    .001
 - 8     24    -2.72    .005
 - 7    103    -2.32    .021
 - 6    239    -1.91    .049
 - 5    624    -1.51    .129
 - 4   1187    -1.11    .246
 - 3   1650    -0.70    .341
 - 2   1883    -0.30    .389
 - 1   1930    +0.10    .399
   0   1638    +0.50    .339
 + 1   1130    +0.91    .234
 + 2    737    +1.31    .152
 + 3    427    +1.71    .087
 + 4    221    +2.12    .046
 + 5    110    +2.52    .023
 + 6     57    +2.92    .012
 + 7     24    +3.33    .005
 + 8      6    +3.73    .001
 + 9      2    +4.13    .000



The skewness and excess together give very different forms to
frequency curves, which recall the normal curve more or less, so
long as the characteristics S and E are small. I shall give in the
next section some examples of such frequency curves.

As first example I choose a series of 12 000 measurements of
the width of a kind of brown bean (Phaseolus vulgaris), which
Professor Johannsen in Copenhagen has very kindly placed at my
disposal. The measurements were made in his biologic laboratory.

Computing first the mean and dispersion by the methods of
Capp. II and III, we have

b = -1.253, $\sigma$ = +2.482,

so that

M = 8.512 mm, $\sigma$ = 0.620 mm.

By means of the formulae (2) of the preceding chapter the
values of the normal coordinates were computed. They are found in
Table 26.

The corresponding points of the observed frequency curve are
given in the diagram by small circles.

[Fig. 4. Width of beans (Phaseolus vulgaris).]

3. <When a frequency curve is drawn from $\beta_3$ and $\beta_4$ solely, one
or both tails will dip below the X-axis, unless $\beta_3$ and $\beta_4$ lie
within the limits given by Appendix Table 44. --Tr.>

4. I take here the opportunity of warmly recommending the
excellent picture of biological statistics that this distinguished
investigator has given [13].



One might at first glance be inclined to consider the
correspondence with the normal curve as extremely good. A closer
observation (which is indeed made somewhat difficult by the fact
that the frequency curve has been reproduced here in very small
format) shews, however, that the accidental deviations may indeed
be considered vanishingly small, but that there obviously exists a
systematic deviation of the observations from the normal curve.
The observed Y-coordinates (I am here following the curve from
left to right) are at first smaller than the Y-coordinates of the
normal curve, then from X = -1.5 to X = +0.3 somewhat larger, then
from X = +0.3 to X = +2.0 smaller again, and finally for higher
X-values larger again.

We find that the observed points determine very exactly a
curve that differs throughout from the normal. In accord with our
earlier definitions of S and E we may say without further
computation that the observed frequency curve obviously has a small
negative skewness and a small positive excess.

This is confirmed by the numerical values of S and E. Indeed
one finds

S = -0.141, E = +0.030.

The peak of the frequency curve lies therefore 0.14 $\sigma$ in the
negative direction from the mean, and the number of elements in
the immediate neighbourhood of the mean is as 103 : 100 to that in
a normal distribution.

I have on an earlier occasion expounded the need to know the
uncertainty of the obtained values of the mean and dispersion.
This need is possibly even greater for the skewness and excess.

For frequency curves that differ only insignificantly from the
normal, the mean errors of S and E are given by the formulae

$$\varepsilon(S) = \frac{1.2247}{\sqrt{N}}, \qquad \varepsilon(E) = \frac{0.6124}{\sqrt{N}}. \tag{3}$$

In the present example, then, where N = 12 000, we have, adding
the mean errors,

S = -0.141 ± 0.011,
E = +0.030 ± 0.006.


We may hence conclude that the values that we have found 
for the skewness and excess cannot here be considered accidental 
fluctuations from a normal frequency curve, but that the frequency 
curve for the width of these brown beans indeed deviates from the 
normal form in a manner that is characterised by the given values
of S and E.
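
Formulae (3) are easily checked by machine. The short sketch below, with hypothetical function names, evaluates the mean errors for the two values of N used in this chapter and prints each characteristic as a multiple of its mean error; how large a multiple one requires before calling a departure systematic is left to the reader and is not prescribed here.

```python
import math


def mean_error_skewness(n):
    """Mean error of S by formula (3), valid for nearly normal curves."""
    return 1.2247 / math.sqrt(n)


def mean_error_excess(n):
    """Mean error of E by formula (3), valid for nearly normal curves."""
    return 0.6124 / math.sqrt(n)


# (label, N, S, E) as quoted in the text for the two examples.
examples = [
    ("width of beans", 12000, -0.141, 0.030),
    ("cephalic index", 22505, -0.121, 0.045),
]

for label, n, s, e in examples:
    es, ee = mean_error_skewness(n), mean_error_excess(n)
    print(f"{label}: S = {s:+.3f} +/- {es:.3f} ({abs(s) / es:.1f} mean errors), "
          f"E = {e:+.3f} +/- {ee:.3f} ({abs(e) / ee:.1f} mean errors)")
```

For N = 200 the same formulae give mean errors of about 0.087 and 0.043, which shows why skewness and excess from small series are rarely trustworthy.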

Drawing the theoretic frequency curve from the values of S and 
E above (applying Table 25), one finds such good agreement with the 
observed values that the deviations are not noticeable on the 
scale used in Fig. 4 . 

In Anthropologia suecica [21] Professors Retzius and Fürst
have studied the Swedish recruits of the years 1897 and 1898 from
the point of view of statistical anthropology. From their work I
extract the following figures, which relate to the cephalic index
for 22 505 Swedish recruits of the year 1897.

5. <$E_g$ = 0.013. --Tr.>

The class interval w is chosen at 2, the provisional mean $M_0$
taken at 77.5. From the observed frequencies F(x) we have

b = -0.721, $\sigma$ = +1.544,

expressed in class interval units. Therefore (returning to the
index as unit),

M = 77.5 - 1.442 = 76.058 ± 0.021,
$\sigma$ = 3.088 ± 0.015.

From b, w, and $\sigma$, we compute the normal coordinates given in the
third and fourth columns of the adjacent table. The corresponding
points are given in Fig. 5 and compared with the normal curve.


Table 27.

Cephalic index of Swedish recruits.

N = 22505, w = 2, $M_0$ = 77.5.

   x    F(x)      X       Y

 - 6     12    -3.42    .001
 - 5     87    -2.77    .006
 - 4    510    -2.12    .035
 - 3   1952    -1.48    .134
 - 2   4346    -0.83    .298
 - 1   6039    -0.18    .414
   0   5050    +0.47    .346
 + 1   2822    +1.11    .194
 + 2   1172    +1.76    .080
 + 3    377    +2.41    .027
 + 4     94    +3.06    .006
 + 5     31    +3.71    .002
 + 6     13    +4.35    .001


[Fig. 5. Cephalic index of recruits.]


Considerations analogous to those of the previous example lead
us to the conclusion that here again the frequency curve must have
a negative skewness and a positive excess. Indeed numerical
computation of the characteristics gives

S = -0.121 ± 0.008,
E = +0.045 ± 0.004.

Drawing the theoretic frequency curve from these values, one
finds practically complete agreement with the observations.
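
The values of S and E for the cephalic index can be recovered from the class frequencies of Table 27. The sketch below does not follow the author's schema, which is not reproduced in this section. It assumes instead the moment expressions S = $-\mu_3 / (2\sigma^3)$ and E = $(\mu_4/\sigma^4 - 3)/8$, which are not stated in the text but are consistent with the mean errors of formulae (3); no correction for grouping is applied.

```python
import math

# Class marks x (class-interval units from M0 = 77.5) and frequencies F(x)
# for the cephalic index of Swedish recruits (Table 27).
x_marks = list(range(-6, 7))
freqs = [12, 87, 510, 1952, 4346, 6039, 5050, 2822, 1172, 377, 94, 31, 13]


def central_moments(x_marks, freqs):
    """Return (N, b, {k: k-th central moment}) in class-interval units."""
    n = sum(freqs)
    b = sum(x * f for x, f in zip(x_marks, freqs)) / n
    mu = {k: sum((x - b) ** k * f for x, f in zip(x_marks, freqs)) / n
          for k in (2, 3, 4)}
    return n, b, mu


n, b, mu = central_moments(x_marks, freqs)
sigma = math.sqrt(mu[2])

# Assumed definitions (see the lead-in): the sign convention makes S
# positive when the peak of the curve lies above the mean.
S = -mu[3] / (2.0 * sigma ** 3)
E = (mu[4] / sigma ** 4 - 3.0) / 8.0

w, M0 = 2.0, 77.5
print(f"N = {n}, b = {b:+.3f}, sigma = {sigma:.3f} (class units)")
print(f"M = {M0 + b * w:.3f}, sigma = {sigma * w:.3f} (index units)")
print(f"S = {S:+.3f}, E = {E:+.3f}")
```

Under these assumptions the printed characteristics agree with those quoted above to about three decimals, which also serves as a check on the frequencies of the table.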

The examples given shew that the form of the frequency curve
is essentially connected with the number of elements in the given
statistical series. A statistical object that for N, say, = 1000,
gives rise to a nearly normal frequency curve, may give for a
smaller N, say, 200, a frequency curve with notable skewness and
excess. Obviously these values of skewness and excess, obtained
from series with small values of N, express no essential property
of the statistical object treated.

We realise from this the necessity of giving, when computing
S and E, their mean error. Only by this means is it possible to
decide whether the deviation of a given frequency curve from the
normal form be accidental. We find from (3) that the mean errors
of S and E (if the 'true' frequency curve differ little from the
normal) depend only upon N and therefore are easily computed. As a
practical rule (to which, however, important exceptions occur) it
may be enunciated that it is not worth while in general to compute
the higher characteristics of a statistical series, unless the
number N of elements be greater than 1000. It is, however, always
the mean error that tells us how far the values of skewness or
excess are trustworthy.

A question which may be regarded as scarcely yet opened is
that of how to explain departures of a frequency curve from the
normal form. It is obvious that the interpretation is fundamentally
connected with the nature of elementary sources of error. The
difficulty, however, lies in that the solution of the problem is
not unique, that even infinitely many solutions exist. Therefore it
is necessary to classify these infinitely many solutions in a
rational manner; but such a classification has not yet been
attempted.

If, retaining the terminology of Quetelet, we call a statistical
object characterised by a normal frequency curve a type, it
can in general be said that a statistical object with non-normal
frequency curve (with definite skewness, excess, etc.) may be
considered composed of several types.

Assume for example that the height of 14-year-old schoolboys
in Sweden forms a certain type and that 15-year-old schoolboys
form another. Should a statistical object be now constructed by
choosing at random 1000 15- and 1000 14-year-old schoolboys, the
frequency curve for these 2000 individuals would no longer be
normal. To be sure the skewness would be zero, but the curve would
have a negative excess. If the number of individuals from the
two classes were unequal, a more or less notable skewness would
also result.

6. <$E_g$ = 0.033. --Tr.>

Or, to take another example, assume that the 1000 14-year-old
schoolboys had the same mean height as 1000 15-year-old school¬ 
girls chosen at random. Should one investigate the frequency curve 
for these 2000 individuals, one would find that this curve 
possessed positive excess and zero skewness. 
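
These statements about compounded types can be verified numerically. The sketch below computes the skewness and excess of a mixture of normal types from the exact moments of its components. The heights and dispersions are invented for illustration and are not taken from any table in the text; S and E are again taken as $-\mu_3/(2\sigma^3)$ and $(\mu_4/\sigma^4 - 3)/8$, an assumption that is consistent with formulae (3) of this chapter.

```python
def mixture_characteristics(components):
    """Return (mean, sigma, S, E) for a mixture of normal types.

    components: list of (count, mean, dispersion) triples, one per type.
    """
    total = sum(c for c, _, _ in components)
    parts = [(c / total, m, s) for c, m, s in components]
    mean = sum(p * m for p, m, _ in parts)
    # Central moments of a normal type about the common mean (d = m - mean):
    # 2nd: s^2 + d^2, 3rd: 3 s^2 d + d^3, 4th: 3 s^4 + 6 s^2 d^2 + d^4.
    mu2 = sum(p * (s ** 2 + (m - mean) ** 2) for p, m, s in parts)
    mu3 = sum(p * (3 * s ** 2 * (m - mean) + (m - mean) ** 3)
              for p, m, s in parts)
    mu4 = sum(p * (3 * s ** 4 + 6 * s ** 2 * (m - mean) ** 2 + (m - mean) ** 4)
              for p, m, s in parts)
    sigma = mu2 ** 0.5
    return mean, sigma, -mu3 / (2 * sigma ** 3), (mu4 / mu2 ** 2 - 3) / 8


# Hypothetical figures: heights in cm, equal dispersions, different means.
equal_counts = mixture_characteristics([(1000, 150.0, 7.0), (1000, 156.0, 7.0)])
unequal_counts = mixture_characteristics([(1000, 150.0, 7.0), (500, 156.0, 7.0)])
# Hypothetical figures: the same mean, different dispersions.
same_mean = mixture_characteristics([(1000, 153.0, 7.0), (1000, 153.0, 5.0)])

for label, (_, _, s, e) in [("equal counts", equal_counts),
                            ("unequal counts", unequal_counts),
                            ("same mean", same_mean)]:
    print(f"{label}: S = {s:+.4f}, E = {e:+.4f}")
```

With equal counts and different means the skewness vanishes and the excess is negative; unequal counts add a skewness; equal means with unequal dispersions give zero skewness and positive excess, as the text asserts.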

Each statistical object that is a combination of a rather 
large number of types will thus in general possess a frequency 
curve of other than normal form. It will be understood from this 
that the normal curve (the Gaussian curve) must be not the rule, 
but the exception, in statistics. If, however, frequency curves 
with a large number of elements appear to depart only insignifi¬ 
cantly from the normal form, it can by no means be concluded that 
the statistical objects consist of only a small number of types. 
Rather must we in general explain this behaviour by the fact that 
the number of types is very large. 

If for any reason we know that a certain frequency curve
consists of the sum of two normal curves (types), we can then compute
from the characteristics of the given frequency curve the mean and
dispersion of these subordinate types. The solution of this
problem was given by Pearson [17, No. 2]. In exceptional cases the
solution can also be found when the statistical object is
compounded of three or more types.


Cap. XII. Frequency Curves of Type B. 


70. Whenever the composition of a frequency curve can be explained
by a summation of elementary errors, the resulting frequency curve
must belong either to the 'Type A' discussed in the preceding
chapter, or to that other form which I have named the frequency
curve of Type B and which I shall briefly consider in this
chapter.

The equation for a frequency curve of Type B is given, like
that of Type A, by an infinite series of terms. The first of these
was already exhibited by Poisson, and was later applied by
Bortkiewicz to certain statistical problems. The general form of
the equation of these curves, derived from the hypothesis of
elementary errors, was first given by me in my paper 'Über die
zweite Form des Fehlergesetzes' [5].

For the genesis of these curves I must refer to my more
complete account. Let it only be mentioned that Type B occurs
principally when the attribute is discontinuously bounded at one
of its values, so that on one side of this bound no individuals
possessing the attribute occur, whereas on the other side of the
bound and in its immediate neighbourhood the property occurs in a
large number of individuals of the population.

If, for example, one wished to investigate the frequency curve
for the number of voters at Swedish elections under the old
conditions, by which no-one with an income under 800 Kronen had a
vote, one would obviously have an attribute (the right to vote)
that was discontinuously bounded.

There were no voters with incomes under 800 Kronen, but immed¬ 
iately over this figure, ex.gr. from 800 to 900 Kronen, there were 
many voters, who, because of this condition, belonged to Type B, 
when arranged according to income. 

It is well to observe that the two types are not sharply dis¬ 
tinguished from each other, but that, in general, transition forms 
exist, which, at least in practice, may be computed ad lib. as of 
one or the other type indifferently. 

In homograde statistics the size of the element is always
discontinuously bounded by the value zero. There can exist no element
to which belongs less than no individual. Therefore, in general,
homograde series of elements comprising a very small number of
individuals have frequency curves of Type B. Thus, with respect to
homograde statistical series, Type B may be named the frequency
curve of rare events.

For example, the frequency curve for the number of stormy days
in a month in Sweden (v. Art. 73 infra) is of Type B, whereas the
number of rainy days, which are not so rare as stormy days, yields
a frequency curve of Type A.

The series that represents the equation of a frequency curve
of Type A has, as we saw in Art. 62, the derivates of the normal
function $\varphi_0$ as terms. In like manner the equation of a frequency
curve of Type B may be represented as a series, which, however,
has as terms the differences of a function $\psi(x)$. This function is
given by the formula

$$\psi(x) = \frac{e^{-\lambda}\lambda^{x}}{x!}, \tag{1}$$

where $\lambda$ is the first characteristic of the statistical series. It
can be chosen in different ways; I shall discuss only one here:
that $\lambda$ coincides with the mean of the x-values. We have then

$$\lambda = M(x) = \Sigma\, x F(x) : N. \tag{2}$$

1. Since this determination of $\lambda$ is not necessary, I have given a
proper name to the characteristic $\lambda$: the modulus of the statistical
series.

If, as usual, we designate by F(x) the number of individuals
in the class x, we have

$$F(x) = N\,[\psi(x) + \gamma_2\Delta^2\psi + \gamma_3\Delta^3\psi + \gamma_4\Delta^4\psi + \cdots], \tag{3}$$

where N denotes the whole number of individuals and $\gamma_2$, $\gamma_3$, $\gamma_4$,
etc., certain constants, which here, together with $\lambda$, are the
characteristics of the statistical series. $\Delta\psi$, $\Delta^2\psi$, etc., denote
the finite differences of $\psi$, calculated from the formula

$$\Delta\psi(x) = \psi(x) - \psi(x-1). \tag{4}$$

I must omit here the formulae for the calculation of $\gamma_3$, $\gamma_4$,
etc., and exhibit only the expression for $\gamma_2$ (a quantity I call
the eccentricity) because its value is generally required in
practical applications. We have

$$\gamma_2 = \Sigma\, x^2 F(x) : 2N - \tfrac{1}{2}\lambda^2 - \tfrac{1}{2}\lambda. \tag{5}$$

The dispersion ($\sigma$) of a series of Type B can be computed from
the formula

$$\sigma^2 = \lambda + 2\gamma_2, \tag{6}$$


which can also be applied to compute $\gamma_2$ when $\lambda$ and $\sigma$ are known.

The value of the function $\psi(x)$ can easily be computed from the
equation (1). The following diagram gives a graphic representation
of the function for different values of $\lambda$.

By means of this figure we can form a picture of the appearance
of frequency curves of Type B. For small values of the modulus $\lambda$
the frequency is greatest for x = 0 and becomes smaller the larger
x gets. If the modulus $\lambda$ = 1, the frequency is as large for x = 0
as for x = 1. For larger values of $\lambda$ the frequency first
increases and reaches a maximum for x = $\lambda$ - 1, and then (for
larger x) decreases. For large values of $\lambda$ the curves approach
those of Type A.
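
The behaviour just described is easily reproduced. The sketch below assumes that formula (1) is the Poisson expression $e^{-\lambda}\lambda^x/x!$, vanishing for negative x, and tabulates it for three values of the modulus chosen only for illustration; the function name `psi` is hypothetical.

```python
import math


def psi(x, lam):
    """First term of the Type B series; zero for negative integers x."""
    if x < 0:
        return 0.0
    return math.exp(-lam) * lam ** x / math.factorial(x)


for lam in (0.5, 1.0, 3.0):
    values = [psi(x, lam) for x in range(11)]
    peak = max(values)
    # For integral lam the maximum is shared by x = lam - 1 and x = lam.
    modes = [x for x, v in enumerate(values) if abs(v - peak) < 1e-12]
    print(f"lambda = {lam}: maximum at x = {modes}; "
          + " ".join(f"{v:.3f}" for v in values[:6]))
```

For $\lambda$ = 0.5 the values fall steadily from x = 0; for $\lambda$ = 1 the values at x = 0 and x = 1 coincide; for $\lambda$ = 3 the curve rises to a maximum shared by x = 2 and x = 3 and falls thereafter.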

I shall confine myself to a single example for frequency curves 
of Type B. 

The number of stormy days in Lund in different months during 
the years 1753-1857 is given in the following table. 


2. <A short table of $\psi(x)$ is given in Appendix Table 43. --Tr.>


3. Formula (3) presupposes that x takes on the values 0, +1, +2,
etc. The value x = 0 corresponds to the bounding value of the property, so
that there are no individuals to be found corresponding to negative values
of x. It must be noted that $\psi(x)$ vanishes for all integral negative
values of x.

[Fig. 6. Frequency curves of Type B: y = $\psi(x)$, for x = 0 to 10.]


This table is to indicate that in the 105 years here
considered, no storms occurred in January, February, and December. In
March there was no storm in 101 of the 105 years and in four years
one stormy day, etc.

The values of the characteristics, computed by (2) and (5),
are to be seen in the adjacent table.

Here $\lambda$ gives the average number of stormy days during a month.
The influence of the eccentricity $\gamma_2$ in the later summer months is
significant. In order to give a picture of the influence of the
second term in (3) for large values of $\gamma_2$, I adduce here the
result for the month of August, compared with the observations.

Table 28.


Number (x) of stormy days in Lund in the years 1753-1857.


N = 105.

Table 29.

Month          $\lambda$         $\gamma_2$

January       0.00      +0.000
February      0.00      +0.000
March         0.04      +0.000
April         0.28      +0.020
May           0.88      +0.047
June      [illegible]   +0.027
July          2.52      +0.781
August        2.13      +1.002
September     0.78      +0.246
October       0.12      +0.002
November      0.08      +0.000
December      0.00      +0.000


The correspondence between the theoretic values (in the
paenultimate column) and the observed values is very good. The
first term in formula (3), taken alone, would here have given a
very incomplete picture of the distribution of stormy days, as is
seen from comparison of the fifth column with the last.

Table 30.

Number of stormy days in August.

$\lambda$ = 2.133, $\gamma_2$ = +1.002.

  x     $\psi$        $\Delta\psi$       $\Delta^2\psi$      N$\psi$    N$\gamma_2\Delta^2\psi$     F     Frequency
                                                                 observed

  0   +0.118   +0.118   +0.118   +12.4   +12.4   +24.8     24
  1   +0.253   +0.134   +0.016   +26.5   + 1.7   +28.2     26
  2   +0.268   +0.017   -0.117   +28.2   -12.3   +15.8     19
  3   +0.191   -0.078   -0.095   +20.1   - 9.9   +10.2     13
  4   +0.102   -0.089   -0.011   +10.7   - 1.2   + 9.5      9
  5   +0.044   -0.058   +0.031   + 4.6   + 3.3   + 7.9      6
  6   +0.016   -0.028   +0.029   + 1.6   + 3.0   + 4.6      5
  7   +0.006   -0.010   +0.018   + 0.5   + 2.0   + 2.5      2
  8   +0.001   -0.004   +0.006   + 0.1   + 0.6   + 0.7      0
  9   +0.000   -0.001   +0.003   + 0.0   + 0.3   + 0.3      0
 10   +0.000   +0.000   +0.001   + 0.0   + 0.1   + 0.1      0
 11   +0.000   +0.000   +0.000   + 0.0   + 0.0   ± 0.0      1
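
The August computation can be repeated from the observed frequencies alone. In the sketch below the observed counts are read from Table 30, with the count for x = 1 taken as 26, the value consistent with N = 105 and $\lambda$ = 2.133; this reading is an assumption about a poorly legible entry. The Poisson form of $\psi(x)$ and the use of only the first two terms of formula (3) follow the text; the function names are hypothetical.

```python
import math

# Observed number of Augusts (out of 105) with x = 0, 1, ..., 11 stormy days.
observed = [24, 26, 19, 13, 9, 6, 5, 2, 0, 0, 0, 1]


def psi(x, lam):
    """Formula (1), taken as zero for negative x."""
    if x < 0:
        return 0.0
    return math.exp(-lam) * lam ** x / math.factorial(x)


def second_difference(x, lam):
    """Second finite difference of psi, built from formula (4) applied twice."""
    return psi(x, lam) - 2.0 * psi(x - 1, lam) + psi(x - 2, lam)


n = sum(observed)
lam = sum(x * f for x, f in enumerate(observed)) / n                  # formula (2)
gamma2 = (sum(x * x * f for x, f in enumerate(observed)) / (2.0 * n)
          - lam ** 2 / 2.0 - lam / 2.0)                               # formula (5)
sigma = math.sqrt(lam + 2.0 * gamma2)                                 # formula (6)

first_term = [n * psi(x, lam) for x in range(len(observed))]
fitted = [n * (psi(x, lam) + gamma2 * second_difference(x, lam))
          for x in range(len(observed))]

print(f"N = {n}, lambda = {lam:.3f}, gamma2 = {gamma2:+.3f}, sigma = {sigma:.3f}")
for x, (t, f, o) in enumerate(zip(first_term, fitted, observed)):
    print(f"{x:2d} {t:6.1f} {f:6.1f} {o:3d}")
```

The middle column printed is the first term alone and the next the two-term value F; small differences from the printed table in the last decimal arise because the table was computed with rounded values of $\lambda$ and $\gamma_2$.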


Cap. XIII. On Correlation. 


One of the most important tasks of practical statistics is the
attempt to determine whether, and in what degree, different
statistical events are dependent upon one another. The problem is
as a rule very difficult because of 'accidental' fluctuations in the
elements of the statistical series, which more or less strongly
mask the dependency of the series.

For a long time the graphic method has been used to
investigate the connexion between statistical series. One draws a curve
that shews how the elements of a statistical series vary, and on
the same paper another curve that portrays the variation of the
elements of a second simultaneously observed statistical series.
Then, by comparative observation of these curves, one can, at
least in many cases, decide whether there is a connexion between
the events.

The method is admirable and very necessary for the first
recognition of connexion. It possesses, however, two important faults;
firstly the accidental errors may be so large that it is difficult
or even impossible to decide by graphic comparison whether a
connexion between the events exist; secondly--and this is the most
notable shortcoming of the method--the degree of the connexion
between the events cannot be determined in this way.

In the great upswing of mathematical statistics that began at
the end of the last century, attention was also turned to this
problem, leading to the discovery of new and more reliable methods
of determining the connexion between statistical events. The
fundamental discovery in this field was made by Sir Francis Galton and
is set forth in his work, notable also in other respects, Natural
Inheritance [10].

The connexion between statistical events is denoted by the
expression correlation. From the standpoint of elementary errors,
two events are called correlated if they, wholly or partly, are
due to the same elementary errors.

Strictly speaking it can obviously be said that all events in
the world are correlated with one another. In most cases, however,
the correlation is so weak that in practice the events may be
considered independent. The strength of correlation is measured by
the coefficient of correlation.¹ I shall discuss in the sequel the
numerical computation of this coefficient and at the same time
present some of its most important properties.

The method of computation is somewhat different, according
as the material is or is not arranged in classes. The latter case,
comprising series of small extent, will be treated in this
chapter.

We assume that two simultaneously observed statistical series
$S_1$ and $S_2$ with the elements

$m_1, m_2, m_3, \ldots,$

$n_1, n_2, n_3, \ldots,$

are at hand, where the elements are supposed to correspond by
pairs ($m_1$ and $n_1$, $m_2$ and $n_2$, etc.). Let the mean and
dispersion of $S_1$ be $M_1$ and $\sigma_1$; those of $S_2$, $M_2$ and
$\sigma_2$. To compute these characteristics as well as to compute the
coefficient of correlation a provisional mean is chosen for each
series, and denoted by $M_{10}$ and $M_{20}$. Let r denote the
coefficient of correlation. Putting

(1) $x_k = m_k - M_{10}, \quad y_k = n_k - M_{20},$

where $k = 1, 2, 3, \ldots, N$, we have the following formulae for
the computation of $M_1$, $\sigma_1$, $M_2$, $\sigma_2$ and r:

(2) $b_1 = \sum x_k : N, \quad b_2 = \sum y_k : N,$


1. More generally, the correlation is given by an infinite series of
characteristics, of which the first--and in general most important--is the
coefficient of correlation. I cannot go into the higher characteristics of
correlation in this resume. <See JØRGENSEN [12]. --Tr.>

(3) $M_1 = M_{10} + b_1, \quad M_2 = M_{20} + b_2,$

(4) $\sigma_1^2 = \sum x_k^2 : N - b_1^2, \quad \sigma_2^2 = \sum y_k^2 : N - b_2^2,$

(5) $\sigma_1 \sigma_2 r = \sum x_k y_k : N - b_1 b_2.$

Formulae (1) to (4) are nothing but the already given formulae
for computing the means and dispersions of the given series. The
coefficient of correlation r is furnished by formula (5).

I shall apply the formulae to a numerical example and shew how 
a simple check on the computation may simultaneously be imposed. 

The supply of drinking water in the city of Lund comes in part
from five reservoirs in Rogle, 5 km from the city. Some time ago I
was engaged by the city to investigate in what degree the water
supply might be dependent upon meteorologic factors. I had,
therefore, as first task, to determine the correlation between the
quantity of water that flowed into the reservoir in one year² and
the total rainfall during the same time.

The investigation is based on the following figures, which were
obtained in the years 1899-1908. Let

T = the whole inflow of water to the reservoir in a year, expressed
in 1000 cbm.

R = the total rainfall during the same time, expressed in mm.


Table 31.

Year      T      R
1899     258    511
1900     708    661
1901     426    597
1902     304    541
1903     762    663
1904     266    563
1905     562    607
1906     422    576
1907     521    530
1908     522    719


A glance at this table suffices to convince us that a connexion
exists between the rainfall and the inflow of water. In dry years
(1899, 1902, 1904) the flow into the reservoirs is small; likewise
in years with much rain the inflow is in general large. The
fluctuations, however, are not insignificant (cf. ex. gr. the years
1907 and 1908), and it is necessary to determine to what extent
the two events are mutually dependent.

2. It is necessary to extend the investigation to the different months
of the year. For the sake of brevity I restrict myself to the results for
different years.

We take the provisional mean at 500 for T and 600 for R,
so that

$T = 500 + x, \quad R = 600 + y.$

Then the computation of the characteristics proceeds according
to the schema of Table 32.


Table 32.

Correlation between rainfall and influx of water at Rogle.
N = 10, $M_{10}$ = 500, $M_{20}$ = 600.

Year      x       y        xx        xy        yy     x + y   $(x+y)^2$
1899   - 242    - 89   +  58600   + 21500   +  7900   - 331    109600
1900   + 208    + 61   +  43300   + 12700   +  3700   + 269     72400
1901   -  74    -  3   +   5500   +   200         0   -  77      5900
1902   - 196    - 59   +  38400   + 11600   +  3500   - 255     65000
1903   + 262    + 63   +  68600   + 16500   +  4000   + 325    105600
1904   - 234    - 37   +  54800   +  8700   +  1400   - 271     73400
1905   +  62    +  7   +   3800   +   400         0   +  69      4800
1906   -  78    - 24   +   6100   +  1900   +   600   - 102     10400
1907   +  21    - 70   +    400   -  1500   +  4900   -  49      2400
1908   +  22    + 119  +    500   +  2600   + 14200   + 141     19900

Sum    - 249    - 32   + 280000   + 74600   + 40200   - 281    469400


The last two columns have been added as a check. For we have

$\sum (x + y)^2 = \sum x^2 + \sum y^2 + 2 \sum xy.$

The table gives

$\sum x^2$   = + 280000
$\sum y^2$   = +  40200
$2 \sum xy$  = + 149200
               + 469400 = $\sum (x + y)^2$,

whereby the entire computation is completely checked.
From the numbers in the bottom line we now have

$b_1 = -249 : 10 = -24.9; \quad b_2 = -32 : 10 = -3.2,$

$\sigma_1^2 = 280000 : 10 - b_1^2 = 27380; \quad \sigma_2^2 = 40200 : 10 - b_2^2 = 4010,$

so that

$\sigma_1 = 165.5, \quad \sigma_2 = 63.3.$

Further we obtain from (5)

$r \sigma_1 \sigma_2 = 74600 : 10 - b_1 b_2 = +7380,$

so that, with the values obtained for $\sigma_1$ and $\sigma_2$,

$r = +0.704$

is the coefficient of correlation.
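
The schema of Table 32 can be retraced mechanically. What follows is only a minimal sketch in Python, not part of the original work: the variable names are illustrative assumptions, the figures are those of Table 31 (the inflow for 1904 being taken as 266, in agreement with x = -234 in Table 32), and since the squares and products are not rounded to hundreds here, the last digit may differ slightly from the values quoted above.

```python
import math

# Figures of Table 31: inflow T (1000 cbm) and rainfall R (mm), 1899-1908.
T = [258, 708, 426, 304, 762, 266, 562, 422, 521, 522]
R = [511, 661, 597, 541, 663, 563, 607, 576, 530, 719]
M10, M20 = 500, 600            # provisional means chosen in the text
N = len(T)

x = [t - M10 for t in T]       # formula (1)
y = [v - M20 for v in R]

b1 = sum(x) / N                # formula (2)
b2 = sum(y) / N
sxx = sum(u * u for u in x)
syy = sum(v * v for v in y)
sxy = sum(u * v for u, v in zip(x, y))

# The check of the last two columns of Table 32:
# sum (x + y)^2 = sum x^2 + sum y^2 + 2 sum xy.
assert sum((u + v) ** 2 for u, v in zip(x, y)) == sxx + syy + 2 * sxy

sigma1 = math.sqrt(sxx / N - b1 ** 2)          # formula (4)
sigma2 = math.sqrt(syy / N - b2 ** 2)
r = (sxy / N - b1 * b2) / (sigma1 * sigma2)    # formula (5)

print(f"b1 = {b1:.1f}, b2 = {b2:.1f}")
print(f"sigma1 = {sigma1:.1f}, sigma2 = {sigma2:.1f}, r = {r:+.3f}")
```

The use of provisional means only shortens the hand computation; the same r results from any choice of $M_{10}$ and $M_{20}$.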

What need prompts the computation of the coefficient of
correlation? To explain this, we shall first elucidate some general
properties of the coefficient.

Mathematical analysis shews that the numerical value of
r is less than or equal to unity. Moreover, the coefficient of
correlation may be positive or negative; if it is positive, then,
in general, the two attributes considered increase together and
decrease together. If r is negative, then one attribute increases
when the other decreases, and vice versa. In the example
considered r was positive, and consequently the inflow of water to the
reservoirs is in general greater in rainy years than in dry. This
is also directly seen from Table 31.

A coefficient of correlation of zero signifies that the observed 
statistical events are mutually independent and have nothing to do 
with each other. 

If r = +1 (or -1), an element in one statistical series is
completely determined by the corresponding element in the other
series.

It is, then, to be understood that the larger the numerical
value of r, the closer the dependency between the observed
statistical series. For example r has been found = +0.96 from
simultaneous measurements of the right and left femur, = +0.80
between the height and the length of femur, and only = -tO.S?
between the height and the length of forearm. The correlation r
between the temperature in Lund in June and July = +0.734; that
between June and November = +0.093.

The coefficient of correlation is not limited to furnishing a
measure of the intensity of the connexion between statistical
events. It also serves to solve the following important problem:

Given the value of an element of one series, what is the most
probable value of the corresponding element in the other series?

Let x designate the deviation of an element in the first series
from its mean $M_1$; let y designate the deviation of the
corresponding element in the second series from its mean $M_2$. Denoting
by X the most probable value of x corresponding to a given y, we have

(6) $X = r \frac{\sigma_1}{\sigma_2} y.$

Contrariwise, denoting by Y the most probable value of y
corresponding to a given x, we have

(7) $Y = r \frac{\sigma_2}{\sigma_1} x.$

Equations (6) and (7) represent geometrically two straight
lines, known as the regression lines. The coefficients of y and of
x in the right sides of equations (6) and (7) are called the
regression coefficients.

If $\sigma_1 = \sigma_2$, these equations assume the form X = ry and
Y = rx. Since r is always numerically smaller than unity, it follows
from the above equations that in this case the most probable value of
an element of the second series departs less from its mean than
the given element departs from the mean of the first series. Hence
the name regression.

We obtain as the equations for the regression lines in the
example of Art. 78,

$T - 475 = 1.84 (R - 597),$

$R - 597 = 0.27 (T - 475),$

where the T and R on the left side denote the most probable values
of inflow and rainfall corresponding respectively to given values
of R and T. The first of these equations can be used to compute the
most probable value of flow of water into the reservoirs from
knowledge of the fluctuations of rainfall in Lund.
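
As an illustration of how the two regression equations arise from formulae (6) and (7), the following is a hedged sketch in Python using the characteristics found above for the Lund example. The function and variable names are assumptions made for this sketch only.

```python
# Characteristics found in the text for the Lund example.
r, sigma1, sigma2 = 0.704, 165.5, 63.3
M1 = 500 - 24.9     # mean inflow T    (provisional mean + b1)
M2 = 600 - 3.2      # mean rainfall R  (provisional mean + b2)

coef_T_on_R = r * sigma1 / sigma2   # regression coefficient of formula (6)
coef_R_on_T = r * sigma2 / sigma1   # regression coefficient of formula (7)


def most_probable_inflow(rain_mm):
    """Most probable inflow T (1000 cbm) for a given rainfall R (mm)."""
    return M1 + coef_T_on_R * (rain_mm - M2)


def most_probable_rainfall(inflow):
    """Most probable rainfall R (mm) for a given inflow T (1000 cbm)."""
    return M2 + coef_R_on_T * (inflow - M1)


print(round(coef_T_on_R, 2), round(coef_R_on_T, 2))
print(round(most_probable_inflow(700)))
```

The product of the two regression coefficients is $r^2$, which is why the two lines coincide only when r = 1 or r = -1.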

As with other characteristics, it is again very necessary to
calculate the mean error of the coefficient of correlation; this is
done by means of the formula

(8) $\epsilon(r) = \frac{1 - r^2}{\sqrt{N}}.$

As in many other cases the mean error varies inversely as the
square root of the number of observations. For constant N it is
smaller, the larger r is. In our example we have

$\epsilon(r) = \frac{1 - (0.704)^2}{\sqrt{10}} = 0.157,$

so that the value of r is to be written in the form

$r = +0.704 \pm 0.157.$
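
The two properties of formula (8) just mentioned can be tried out numerically. The snippet below is a small Python sketch, with a function name chosen for illustration only; evaluated without intermediate rounding it gives roughly 0.16 for the Lund example.

```python
import math


def mean_error_r(r, n):
    """Mean error of the coefficient of correlation, formula (8)."""
    return (1.0 - r * r) / math.sqrt(n)


e = mean_error_r(0.704, 10)
print(f"{e:.2f}")

# Four times as many observations halve the mean error ...
print(f"{mean_error_r(0.704, 40):.2f}")
# ... and for constant N a larger r has a smaller mean error.
print(f"{mean_error_r(0.9, 10):.2f}")
```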

83. As a suitable example to shew the great practical utility of
the mathematical theory of correlation, I shall give the results
of an investigation of the correlation between the temperatures in
Lund for different months. The numbers are derived from
observations in Lund over the years 1753-1857.


Table 33. 

Correlation between the temperatures in different months in Lund.

N = iu6. 


M 

& 

Mon. 

fen. . 

Feb. 

March 

Apr. 

May 

June 

July 

Aug. 




Dec. 

- 2«.21 

ZHi 

Jan. 













- 2.19 

2.84 

Feb. 

-f 0.330 












“ 1 .19 

2.68 

March 

+ 0.284 

+ 0.580 




1 







f- 3.04 

1 .80 

Apr. 

+ 0.182 

+ 0.288 

+ 0.421 










f 8 .42 

1 .90 

May 

+ 0.181 

+ 0.282j 

+ 0.808 

+ 0.429 









fl2.99 

1 .99 

June 

+ 0.268 

+ 0.188 

+ 0.209 

+ 0.409 

+ 0.686 








h 15 .10 

1 .91 

July 

+ 0.241 

+ 0.060 

+ 0.146 

+ 0.861 

+ 0J>18 

+ 0.784 







f 14.65 

1 .77 

Aug. 

+ 0.184 

+ 0.284 

+ 0.178 

+ 0.806 

+ 0.498 

+ 0.689 

+ 0.696 






f-n .46 

1 .52 

Sent. 

+ 0.802 

+ 0.240 

+ 0.164 

+ 0.184 

+ 0.884 

+ 0.449 

+ 0.401 

1 

+ 





t 7.27 

1 .76 

Oct. 

+ O.067 

+ O.J09 

+ 0.167 

+ 0.221 

+ 0.268 

'+0.271 

+ 0.128 

+ 0.161 

+0.266 




f 2.69 

1 .78 

Nov. 

— 0.098 

—0.110 

+ 0.016 

+ 0.128 

+ 0.147 

+ 0.098 

+ 0.002 

+ 0.041 

+ 0.141 

+ 0.1O4 



- 0.19 

! 2 .81 

Dec. 

—0.166 

+ 0.181 

+ 0.070 

+ 0.109 

+ 0.281 

i + 0.189 

+ 0.148 

+ 0.281 

4-0.104 

+ 0.128 

+ 0.242 



Unfortunately I can not here go into a further discussion of
this interesting series of correlations.


Cap. XIV.

On Correlation between Series arranged in Classes.


84.


If the number of elements is so large that a division into
classes is indicated, the coefficient of correlation, like the
other characteristics, can be computed with greater accuracy. The
computational process is significantly abridged by division into
classes. The form of computation is changed, although formulae (1)
to (7) of the previous chapter remain valid. I shall carry out an
explicit example of correlation between classified series, and
emphasise the points that may be of interest in the treatment of
such a problem.


Table 34.

Correlation between the height of fir trees and the length of
their topmost branch.

N = 330, $w_1$ = 4 cm, $w_2$ = 10 cm.

Height of            Length of the topmost branch (cm)
the fir (cm)    1-4    5-8   9-12  13-16  17-20  21-24  25-28      Σ
   1-10          5      3                                          8
  11-20          7     15      1      2                           25
  21-30          2     21     17      2                           42
  31-40                15     37     20      3                    75
  41-50                 1     31     26      4                    62
  51-60                 2      8     27     12      1             50
  61-70                        2      9     24      4      2      41
  71-80                        1      4      7      6      2      20
  81-90                               1             3              4
  91-100                                            1      2       3

   Σ            14     57     97     91     50     15      6     330

In the summer of 1909 my daughters Essie and Sonja measured
the height of 330 young firs (3 to 4 years old) growing on
Kämpinge-Heide in Schonen, and likewise the length of their topmost
branch. The trees were all shorter than 1 m; the branches all shorter
than 28 cm.

The values obtained in these 330 pairs of measurements
are not given here; instead they are collected into a so-called
correlation table. The heights of the firs are grouped in classes
with class interval ($w_2$) of 10 cm, and the corresponding lengths
of branches in classes with class interval ($w_1$) of 4 cm. Table 34
gives a conspectus of the measurements.

It is seen from this table that five firs had a height of 1 to
10 cm and a top of 1 to 4 cm in length. It is further seen
that three firs, shorter than 10 cm, had tops of 5 to 8 cm in
length, etc.

For numerical computation, provisional means are first chosen
for the two lengths. We take the class 13 to 16 cm as provisional
mean for the tops. The centre of this class, $M_{10}$, is 14.5 cm. For
the height of the firs we choose the centre of the class 41 to 50
cm as provisional mean; $M_{20}$ = 45.5 cm. We number the classes of
tops in order with the class marks -3, -2, -1, 0, +1, +2, +3,
so that the class 0 is that class containing the provisional
mean $M_{10}$. We designate any of these class marks by x. Similarly we
number the classes of height of firs and designate these marks by
y. The computation of the characteristics now proceeds as in
Table 35.


We may call each small quadrangle, characterised by a certain
x and a certain y, a subclass. It is seen that in each of these
subclasses three numbers are entered. The number in large type,
designated F (= frequency), gives the number of individuals within
each subclass, and is the same as the corresponding number in
Table 34. Then the products of F by the class marks x and y are
entered in small type. The numbers in the same column have the
same x-value; the numbers in the same row have the same y-value.
When the number of classes is not large, these products are
obtained without the help of a computing machine. After all
subclasses have been filled up in this manner, all numbers are added
horizontally and vertically. Their sums are then multiplied, as
the table shews, by x and y. Finally all numbers of the last row
and last column are summed, and the sums entered in the lower
right corner. The seven sums obtained in this manner give the
desired characteristics, as shall immediately be demonstrated.
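
The tabular process just described may be sketched as follows. This is an illustrative Python rendering under stated assumptions, not the author's schema: the frequencies are those of Table 34, the names `F`, `xs`, `ys` and the sums `N`, `Sx`, `Sy`, `Sxx`, `Syy`, `Sxy` are chosen for this sketch, and the product sum is formed both row-wise and column-wise as in the table.

```python
# Frequencies F of Table 34. Rows: height classes with marks y = -4 ... +5;
# columns: branch-length classes with marks x = -3 ... +3.
F = [
    [5,  3,  0,  0,  0, 0, 0],
    [7, 15,  1,  2,  0, 0, 0],
    [2, 21, 17,  2,  0, 0, 0],
    [0, 15, 37, 20,  3, 0, 0],
    [0,  1, 31, 26,  4, 0, 0],
    [0,  2,  8, 27, 12, 1, 0],
    [0,  0,  2,  9, 24, 4, 2],
    [0,  0,  1,  4,  7, 6, 2],
    [0,  0,  0,  1,  0, 3, 0],
    [0,  0,  0,  0,  0, 1, 2],
]
xs = list(range(-3, 4))
ys = list(range(-4, 6))

# Horizontal sums (one per row): sum F, sum xF, sum yF.
row_F = [sum(row) for row in F]
row_xF = [sum(x * f for x, f in zip(xs, row)) for row in F]
row_yF = [y * n for y, n in zip(ys, row_F)]

# Vertical sums (one per column): sum F, sum xF, sum yF.
col_F = [sum(row[j] for row in F) for j in range(len(xs))]
col_xF = [x * n for x, n in zip(xs, col_F)]
col_yF = [sum(y * row[j] for y, row in zip(ys, F)) for j in range(len(xs))]

N = sum(row_F)
Sx = sum(row_xF)                                   # sum of xF
Sy = sum(row_yF)                                   # sum of yF
Sxx = sum(x * v for x, v in zip(xs, col_xF))       # sum of x^2 F
Syy = sum(y * v for y, v in zip(ys, row_yF))       # sum of y^2 F
Sxy = sum(y * v for y, v in zip(ys, row_xF))       # sum of xyF, row-wise
Sxy_check = sum(x * v for x, v in zip(xs, col_yF)) # the same, column-wise

print(N, Sx, Sy, Sxx, Syy, Sxy, Sxy_check)
```

Forming the product sum in two ways gives the same kind of check on the arithmetic as the last two columns of Table 32.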

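Table 35.

Computation of the coefficient of correlation.
N = 330, $M_{10}$ = 14.5, $M_{20}$ = 45.5, $w_1$ = 4 cm, $w_2$ = 10 cm.

Each subclass shews F (xF, yF); a point marks an empty subclass.

 y \ x   -3             -2             -1             0              +1             +2             +3                 ΣF    ΣxF   yΣxF    ΣyF   yΣyF
  -4     5 (-15, -20)   3 (-6, -12)    .              .              .              .              .                   8    -21    +84    -32   +128
  -3     7 (-21, -21)   15 (-30, -45)  1 (-1, -3)     2 (0, -6)      .              .              .                  25    -52   +156    -75   +225
  -2     2 (-6, -4)     21 (-42, -42)  17 (-17, -34)  2 (0, -4)      .              .              .                  42    -65   +130    -84   +168
  -1     .              15 (-30, -15)  37 (-37, -37)  20 (0, -20)    3 (+3, -3)     .              .                  75    -64    +64    -75    +75
   0     .              1 (-2, 0)      31 (-31, 0)    26 (0, 0)      4 (+4, 0)      .              .                  62    -29      0      0      0
  +1     .              2 (-4, +2)     8 (-8, +8)     27 (0, +27)    12 (+12, +12)  1 (+2, +1)     .                  50     +2     +2    +50    +50
  +2     .              .              2 (-2, +4)     9 (0, +18)     24 (+24, +48)  4 (+8, +8)     2 (+6, +4)         41    +36    +72    +82   +164
  +3     .              .              1 (-1, +3)     4 (0, +12)     7 (+7, +21)    6 (+12, +18)   2 (+6, +6)         20    +24    +72    +60   +180
  +4     .              .              .              1 (0, +4)      .              3 (+6, +12)    .                   4     +6    +24    +16    +64
  +5     .              .              .              .              .              1 (+2, +5)     2 (+6, +10)         3     +8    +40    +15    +75

  ΣF     14             57             97             91             50             15             6                 330   -155   +644    -43  +1129
  ΣxF    -42            -114           -97            0              +50            +30            +18              -155
  ΣyF    -45            -112           -59            +31            +78            +44            +20               -43
  xΣxF   +126           +228           +97            0              +50            +60            +54              +615
  xΣyF   +135           +224           +59            0              +78            +88            +60              +644


Noticing that the numbers F indicate how often each combination
of x and y occurs in the population at hand, we may directly
apply formulae (2) to (5) of the previous chapter and obtain

$b_1 = -155 : 330 = -0.470, \quad b_2 = -43 : 330 = -0.130,$

$\sigma_1 = \sqrt{615 : 330 - b_1^2} = 1.283, \quad \sigma_2 = \sqrt{1129 : 330 - b_2^2} = 1.845,$

$r \sigma_1 \sigma_2 = 644 : 330 - b_1 b_2 = +1.890.$
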
Here $b_1$, $b_2$, $\sigma_1$ and $\sigma_2$ are expressed in class
interval units. r is an abstract number and therefore independent of
the units of the attributes observed. If it is desired to express the
characteristics in centimeters, $b_1$ and $\sigma_1$ must be multiplied
by $w_1$ (= 4 cm); $b_2$ and $\sigma_2$ by $w_2$ (= 10 cm). Adding
$b_1 w_1$ to $M_{10}$ and $b_2 w_2$ to $M_{20}$, we obtain the means
$M_1$ and $M_2$ for the length of the tops and the height of the firs.
Carrying out this work, and adding the mean errors of M and σ from
formulae (2) and (3) of Cap. IV, and the mean error of r from
formula (8) of Cap. XIII, we have the values


Ml = 12.e cm ± Ojb; M, = 44^ cm + 1 .<b, 
ffj == 5.13 cm ± 0.»; Of = 18.45 cm ± 0.», 

r = -f 0.as ± 0.0I8. 
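
The passage from the seven sums to the characteristics in centimeters can be sketched as below. This is a hedged Python illustration, not the author's own computation: the six numerical sums are those quoted in the text for the fir example, the variable names are assumptions, and the mean errors of M and σ are deliberately left out because the formulae of Cap. IV are not reproduced in this chapter.

```python
import math

# Sums from Table 35, in class interval units.
N, Sx, Sy, Sxx, Syy, Sxy = 330, -155, -43, 615, 1129, 644
M10, M20 = 14.5, 45.5      # provisional means in cm
w1, w2 = 4.0, 10.0         # class intervals in cm

b1, b2 = Sx / N, Sy / N
s1 = math.sqrt(Sxx / N - b1 ** 2)        # dispersion of x, class units
s2 = math.sqrt(Syy / N - b2 ** 2)        # dispersion of y, class units
r = (Sxy / N - b1 * b2) / (s1 * s2)      # abstract number, no units

M1 = M10 + b1 * w1                       # mean length of top, cm
M2 = M20 + b2 * w2                       # mean height of fir, cm
sigma1_cm = s1 * w1
sigma2_cm = s2 * w2
err_r = (1 - r * r) / math.sqrt(N)       # formula (8) of Cap. XIII

print(f"M1 = {M1:.1f} cm, M2 = {M2:.1f} cm")
print(f"sigma1 = {sigma1_cm:.2f} cm, sigma2 = {sigma2_cm:.2f} cm")
print(f"r = {r:+.2f}, mean error of r = {err_r:.3f}")
```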


I now pass to consideration of the regression lines and return
for this purpose to Table 34. As already mentioned, this
arrangement is known as a correlation table. A column or row is called an
array by the English writers; I shall retain this name. More
precisely a vertical row is called a y-array and a horizontal row an
x-array. That value of x or of y that is common to all the
elements of the same array is ordinarily called its type. Thus the
fourth horizontal row of Table 34 is an x-array of type 35.5; the
first vertical row a y-array of type 2.5.

Each array is a statistical series and thus possesses a mean 
and a dispersion. The following interesting theorems hold for 
these characteristics: 

The means of the arrays lie on the regression lines. 

The dispersion of all x-arrays is the same and has the value
$\sigma_1 \sqrt{1 - r^2}$; likewise all y-arrays have equal dispersion
of value $\sigma_2 \sqrt{1 - r^2}$.


Let us now see how far these theorems are fulfilled in the 
present example. 

To simplify the treatment, the values of x and y will be taken 
in class interval units. We may then employ Table 35 directly. The 
equations for the regression lines, according to formulae (6) and 
(7) of the preceding chapter, are: 


(1) $X - b_1 = r \frac{\sigma_1}{\sigma_2} (y - b_2),$

(2) $Y - b_2 = r \frac{\sigma_2}{\sigma_1} (x - b_1),$

where, as before, X signifies the most probable value of x
corresponding to a given y, and Y the most probable value of y
corresponding to a given x. Substituting the numbers from Art. 85
these equations become

for given y: X + 0.47 = O.S73 (y + 0.13);

for given x: Y + 0.13 = l.iss (x + 0.47).

These two regression lines are drawn in the following diagram
(Table 36) upon a system of coordinates whose origin is in the
point x = y = 0.


Table 36.

Regression lines.



Mean  -3.21  -1.97  -0.61  +0.34  +1.56  +2.93  +3.33
Disp. 0.70 147 1.06 149 1,06 1.01 l46

The means of the different x- and y-arrays are also drawn on
this system of coordinates. The means of the x-arrays are
indicated by small circles, the means of the y-arrays by crosses.

It is seen that the means lie very near the regression lines.
The departures are in complete accord with the corresponding mean
errors of the means; the calculation of these mean errors is here
omitted.

It must be remarked that the two lines of regression (1) and
(2) would coincide if r were = 1 and the correlation were perfect.
This follows directly from equations (1) and (2). In the present
case the coefficient of correlation is reasonably large, so that
the angle between the lines of regression is insignificant.

Considering, further, the dispersions of the different arrays,
the numbers in the last vertical row of Table 36, which give the
values of dispersion in the x-arrays, shew that these values are
indeed not identical, but variable like any other statistical
event. The fluctuations are again in accord with the mean error,
and taking the mean of all values of dispersion, we have a number

0.73

that accords well with the theoretic value

$\sigma_1 \sqrt{1 - r^2} = 0.77.$

Similar remarks hold for the dispersions of the y-arrays. The
mean of the dispersions observed is

1.00,

and the theoretic value

$\sigma_2 \sqrt{1 - r^2} = 1.043.$

The regression line (1) enables us to compute the most prob¬ 
able value of the length of the top of a fir of given height. From 
the line (2) we can compute the most probable height of a fir whose 
top has a given length. This is not the place to investigate the 
degree to which these questions may be of biologic interest. 

88. Whenever one must investigate whether and to what extent two
statistical events are mutually dependent, one must turn to the
theory of correlation. The solution, however, is not always so
easy to find as we have assumed in these chapters. Indeed the
coefficient of correlation r gives in many cases all available
information as to this connexion. If, however, the task is complex,
or the number of individuals in the population very large, so that
greater exactitude is practicable and desirable in the treatment
of the problem, the higher characteristics of the correlation
function must also be determined.

I shall not go more closely into this question here, but shall 
merely point out that the correlation function can in general be 
represented by an equation that is a direct generalisation of the 
equation we introduced in Cap. XI for frequency curves of Type A. 
The coefficients in this series may be obtained by elementary means 
analogous to those by which we obtain the skewness and excess of a 
statistical series. 

The relation connecting a certain given value of one attribute
with the most probable value of the other takes on a more
complicated form than that of formulae (1) and (2) of Art. 87 when the
higher characteristics are taken into account. The lines of
regression which give this connexion are no longer right lines, but
are in general curves resembling hyperbolae. If, in the example of
this chapter, we had had also to consider measurements of firs of
age 20 to 30 years (instead of only 3 to 4 years), it would have
been necessary, to solve the problem of correlation, to take the
higher characteristics of the correlation function into account.


Cap. XV.


Abridged Methods for calculating the Characteristics.

I have treated in previous chapters the general methods for
computing the coefficient of correlation. In special cases, which,
however, may be of great practical importance, these methods fail
us. Particularly does this happen when, for some reason, the
statistical material is divided into a very small number of
classes. The methods of this chapter refer to this division.

To treat these problems we use certain functions that have not
been needed previously: firstly the integral

(1) $Q(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2} \, dt,$

and secondly the inverse of this function, denoted by R. Thus

(2) $a = Q(x); \quad x = R(a).$

1. <In the tables, Q is designated --Tr.>

For the shape of these functions see the tables at the end of
this book. I merely remark here that

(3) $Q(-x) = 1 - Q(x), \quad R(1 - a) = -R(a).$
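
In place of the printed tables, the two functions can be evaluated numerically. The following Python sketch assumes that Q is the integral of the normal frequency function taken from minus infinity with unit dispersion, an assumption that agrees with the tabular values used in the examples of this chapter; it relies on `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

_normal = NormalDist()   # mean 0, dispersion 1


def Q(x):
    """Proportion of individuals whose normal coordinate is below x."""
    return _normal.cdf(x)


def R(a):
    """Inverse of Q: the normal coordinate below which the proportion a lies."""
    return _normal.inv_cdf(a)


print(round(Q(1.419), 3))      # a tabular value used in Art. 92 ff.
print(round(R(0.255), 3))

# The two relations (3):
print(abs(Q(-1.3) - (1 - Q(1.3))) < 1e-12)
print(abs(R(1 - 0.2) + R(0.2)) < 1e-9)
```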


92. Computation of the mean and dispersion from material divided
into three classes.

Let some attribute of an object be under study, and let x
denote the degree of the attribute. If, then, $x_1$ and $x_2$ are two
values of the attribute, and we know the number ($A_1$) of
individuals whose attribute is smaller than $x_1$, and the number ($A_2$)
whose attribute is smaller than $x_2$, as well as the whole number (N)
of individuals, we can compute the mean and dispersion of x.²

First we have

(4) $A_1 = N Q(X_1), \quad A_2 = N Q(X_2),$

where $X_1$ and $X_2$ are the normal coordinates corresponding to
$x_1$ and $x_2$, so that

(5) $X_1 = \frac{x_1 - M}{\sigma}, \quad X_2 = \frac{x_2 - M}{\sigma}.$

But from (4) we have

(6) $X_1 = R(A_1/N), \quad X_2 = R(A_2/N),$

where R designates the function R of Art. 91.

Since $A_1/N$ and $A_2/N$ are known, we can compute $X_1$ and $X_2$
from Table 42 at the end of the book. Equations (5) then give the
values of M and σ, and so we have

(7) $\sigma = \frac{x_2 - x_1}{X_2 - X_1}, \quad M = x_1 - \sigma X_1.$

2. If, as well as the mean and the dispersion, the total number of
individuals is unknown, the problem can be solved similarly. I refer you,
in this connexion, to [5].


In this process it is irrelevant whether the statistical 
series treated be homograde or heterograde. 

Example. We return to the first example of the computation of
the dispersion (p. 16). The full distribution of the number of
boys per 500 births is given in Table 4. Let us assume that
we only knew that in 147 of the 576 cases investigated there
were fewer than 249 boys in 500, and in 531 cases fewer than
274. Then

$x_1 = 249, \quad x_2 = 274,$

$A_1 = 147, \quad A_2 = 531.$

Further N = 576.

So we obtain $A_1/N = 0.255$, $A_2/N = 0.922$, and from Table 42
$X_1 = -0.659$, $X_2 = +1.419$, and hence

$\sigma = \frac{274 - 249}{1.419 + 0.657} = \frac{25}{2.076} = 12.0,$

$M = 249 + 12.04 \times 0.659 = 256.9,$

in good agreement with the values found in Capp. II and III.
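
The example can be repeated without the printed Table 42. The sketch below is an illustration in Python under the assumption that R is the inverse of the normal integral with unit dispersion (here `statistics.NormalDist().inv_cdf`); the function name `mean_and_dispersion` is hypothetical.

```python
from statistics import NormalDist


def mean_and_dispersion(x1, x2, A1, A2, N):
    """Mean and dispersion from two cumulative counts, formulae (6) and (7).

    A1 and A2 are the numbers of individuals below x1 and below x2.
    """
    R = NormalDist().inv_cdf
    X1 = R(A1 / N)                    # formula (6)
    X2 = R(A2 / N)
    sigma = (x2 - x1) / (X2 - X1)     # formula (7)
    M = x1 - sigma * X1
    return M, sigma


M, sigma = mean_and_dispersion(249, 274, 147, 531, 576)
print(f"sigma = {sigma:.1f}, M = {M:.1f}")
```

Only the two proportions $A_1/N$ and $A_2/N$ enter, so the same function serves whether the counts come from a homograde or a heterograde series.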

94. As second example I take the following. In measuring the
cephalic index of humans, anthropologists distinguish between
dolichocephalics with an index smaller than 75, brachycephalics
with an index greater than 80, and mesocephalics with an index
between 75 and 80. In the work of C. M. Fürst and F. C. C. Hansen
[9, p. 121] is found the following synopsis of the distribution
of dolichocephalics, etc., for different times and places.

Table 37.

Distribution of the cephalic index.

                       Dolichocephaly   Mesocephaly   Brachycephaly
                            < 75          75 - 80         > 80
Greenland                   84 %           15 %            1 %
Sweden, stone age           51 %           40 %            9 %
   "    iron age            66 %           29 %            5 %
   "    modern age          30 %           57 %           13 %
Bavaria                      1 %           16 %           83 %

We denote the cephalic index by x; in this case $x_1$ = 75,
$x_2$ = 80. By the rules of Art. 92 we have

Table 38.

                       $A_1/N$  $A_2/N$   $X_1$    $X_2$     σ      M
Greenland               0.84     0.99    + 0.99   + 2.33    3.7   71.3
Sweden, stone age       0.51     0.91    + 0.03   + 1.34    3.8   74.9
   "    iron age        0.66     0.95    + 0.41   + 1.64    4.1   73.3
   "    modern age      0.30     0.87    - 0.52   + 1.13    3.0   76.6
Bavaria                 0.01     0.17    - 2.33   - 0.95    3.6   83.4


It must however be noted that the mean error of a statistic is
larger when computed by the abridged method than when computed by
the full method.
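
The rows of Table 38 follow from Table 37 by the rules of Art. 92. The following Python sketch shews one way of retracing them; it assumes the percentages of Table 37 as reconstructed above and uses `statistics.NormalDist().inv_cdf` in place of the printed table. Since the normal coordinates are not rounded to two decimals here, the results may differ from the tabulated values by about a unit in the last place.

```python
from statistics import NormalDist

R = NormalDist().inv_cdf
x1, x2 = 75.0, 80.0   # limits of mesocephaly

# Percentages (dolichocephaly, mesocephaly, brachycephaly) of Table 37.
table37 = {
    "Greenland":          (84, 15, 1),
    "Sweden, stone age":  (51, 40, 9),
    "Sweden, iron age":   (66, 29, 5),
    "Sweden, modern age": (30, 57, 13),
    "Bavaria":            (1, 16, 83),
}

results = {}
for place, (dol, mes, bra) in table37.items():
    X1 = R(dol / 100)                 # proportion with index below 75
    X2 = R((dol + mes) / 100)         # proportion with index below 80
    sigma = (x2 - x1) / (X2 - X1)     # formula (7)
    M = x1 - sigma * X1
    results[place] = (sigma, M)
    print(f"{place:20s} X1 = {X1:+.2f}  X2 = {X2:+.2f}  "
          f"sigma = {sigma:.1f}  M = {M:.1f}")
```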

Computation of coefficient of correlation from four subclasses
in heterograde statistics.

In a collection of N individuals we consider two attributes,
A and B, of the individuals. We denote the degree of the attribute
A by x, that of the attribute B by y. Now we undertake a
distribution of the N individuals into the following four subclasses:

Subclass a: Individuals with $x < x_1$; $y < y_1$;
   "     b:      "       "   $x > x_1$; $y < y_1$;
   "     c:      "       "   $x < x_1$; $y > y_1$;
   "     d:      "       "   $x > x_1$; $y > y_1$.

This distribution may be seen in the adjacent figure. The
intersection of the two right lines has here the coordinates
$x = x_1$, $y = y_1$.

Obviously we have

$a + b + c + d = N,$

where a, b, c, d here designate the number of individuals in the
corresponding subclass.

Designating the normal coordinates of the point of intersection
by h and k, we have by formula (6):

(8) $h = R\left(\frac{a + c}{N}\right), \quad k = R\left(\frac{a + b}{N}\right).$

Since a, b, c, d are known, we can compute the quantities h
and k from (8). We cannot obtain the mean and dispersion of each
of the two attributes; for this the knowledge of more subclasses
were necessary. We can, however, compute the coefficient of
correlation (r) of the two attributes. The solution of this
problem, given by Pearson, is the following. We introduce H and K,
setting

(9) $H = \varphi_0(h); \quad K = \varphi_0(k);$

a short table of $\varphi_0$ is given in Table 41 at the end. r is then
obtained from the following equation:

(10) $\frac{ad - bc}{N^2 H K} = r + \frac{r^2}{2} hk + \frac{r^3}{6} (h^2 - 1)(k^2 - 1) + \frac{r^4}{24} h (h^2 - 3) k (k^2 - 3)$
$\quad + \frac{r^5}{120} (h^4 - 6h^2 + 3)(k^4 - 6k^2 + 3) + \frac{r^6}{720} h (h^4 - 10h^2 + 15) k (k^4 - 10k^2 + 15)$
$\quad + \frac{r^7}{5040} (h^6 - 15h^4 + 45h^2 - 15)(k^6 - 15k^4 + 45k^2 - 15) + \ldots$

If r is small, the root of this equation can easily be
expanded in a series of powers of $(ad - bc)/N^2 HK$. In other cases the
root may be found by trial. Pearson [19, XXIX-XXX: l-lv, 42-57;
20, V-IX: xliv-lxxix, 73-109] has published various tables to
facilitate this computation.
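
Finding the root "by trial" can be mechanised. The following is a sketch in Python under explicit assumptions, not Pearson's tabular procedure: the series of equation (10) is cut off after the term in $r^7$, $\varphi_0$ is taken to be the normal frequency function with unit dispersion, h and k are taken from the proportions below $x_1$ and below $y_1$, and the root is sought by bisection, which presumes that the truncated series increases with r (true for moderate h and k). The function names are hypothetical.

```python
import math
from statistics import NormalDist


def _polys(t):
    """The polynomial factors of equation (10) for the terms r^1 ... r^7."""
    return [1.0, t, t**2 - 1, t * (t**2 - 3), t**4 - 6 * t**2 + 3,
            t * (t**4 - 10 * t**2 + 15), t**6 - 15 * t**4 + 45 * t**2 - 15]


def series(r, h, k):
    """Right side of equation (10), truncated after the r^7 term."""
    ph, pk = _polys(h), _polys(k)
    return sum(r**n / math.factorial(n) * ph[n - 1] * pk[n - 1]
               for n in range(1, 8))


def fourfold_r(a, b, c, d):
    """Coefficient of correlation from the four subclasses a, b, c, d."""
    n = a + b + c + d
    normal = NormalDist()
    h = normal.inv_cdf((a + c) / n)          # proportion with x below x1
    k = normal.inv_cdf((a + b) / n)          # proportion with y below y1
    target = (a * d - b * c) / (n * n * normal.pdf(h) * normal.pdf(k))
    lo, hi = -1.0, 1.0
    for _ in range(60):                      # bisection: root "by trial"
        mid = (lo + hi) / 2
        if series(mid, h, k) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(fourfold_r(300, 200, 200, 300), 3))
```

When both dividing lines pass through the medians (h = k = 0) the series reduces to that of arc sin r, which gives a convenient check on the implementation.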

Computation of coefficient of correlation from four subclasses
in homograde statistics.

We consider two attributes, A and B, of the individuals of a
certain population. The individuals may then be distributed into
four groups, which we shall denote by the symbols

AB, Ab, aB, ab.

This symbolism is to signify that those groups possessing the
attribute A have the A in their symbols; those lacking it the
a; likewise with B and b; so that we may consider a = non-A,
b = non-B.

Taking an individual of the population at random, we shall
designate the probability that this individual belong to a certain
one of these groups by $p_1$, $p_2$, $p_3$, $p_4$ respectively; so that
$p_1$ is equal to the quotient of the number with the attributes A and
B by the total number of the population.

Now let this process be performed not once, but s times
(assuming that both s and the population P are large numbers);
then, by a theorem known from probability, the probability
$B(m_1, m_2, m_3, m_4)$ of obtaining in these s trials

$m_1$ individuals from the group AB,
$m_2$      "        "    "    "   Ab,
$m_3$      "        "    "    "   aB,
$m_4$      "        "    "    "   ab

is given by the following formula (where $s = m_1 + m_2 + m_3 + m_4$):

(11) $B(m_1, m_2, m_3, m_4) = \frac{s!}{m_1! \, m_2! \, m_3! \, m_4!} \, p_1^{m_1} p_2^{m_2} p_3^{m_3} p_4^{m_4}.$

This formula may be treated by the same method I have used in
my paper 'Die strenge Form des BERNOULLIschen Theorems'. I
confine myself here to the largest term of the solution. The
formula of Moivre and Stirling furnishes the approximations

(12) $s! = e^{-s} s^s \sqrt{2\pi s}, \quad m_1! = e^{-m_1} m_1^{m_1} \sqrt{2\pi m_1},$

etc., so that

(13) $B(m_1, m_2, m_3, m_4) = \frac{\left[\frac{m_1}{s p_1}\right]^{-(m_1 + 1/2)} \left[\frac{m_2}{s p_2}\right]^{-(m_2 + 1/2)} \left[\frac{m_3}{s p_3}\right]^{-(m_3 + 1/2)} \left[\frac{m_4}{s p_4}\right]^{-(m_4 + 1/2)}}{\sqrt{(2\pi)^3 s^3 p_1 p_2 p_3 p_4}}.$

The expression in the numerator is treated in the usual manner
by setting

(14) $\left[\frac{m_1}{s p_1}\right]^{-(m_1 + 1/2)} = e^{-(m_1 + 1/2) \log \frac{m_1}{s p_1}}$

and expanding the exponents in series.

For this purpose four numbers $l_1$, $l_2$, $l_3$ and $l_4$ are
introduced by the relations

(15) $m_1 = s p_1 + l_1,$
     $m_2 = s p_2 + l_2,$
     $m_3 = s p_3 + l_3,$
     $m_4 = s p_4 + l_4,$

so that (because $p_1 + p_2 + p_3 + p_4 = 1$)

(16) $l_1 + l_2 + l_3 + l_4 = 0.$

Expanding the exponent in (14) in powers of $l_1$ we have first

$\log \frac{m_1}{s p_1} = \frac{l_1}{s p_1} - \frac{l_1^2}{2 s^2 p_1^2} + \frac{l_1^3}{3 s^3 p_1^3} - \ldots,$

whence we obtain, by keeping only the lowest powers of $l_1$,

(17) $(m_1 + 1/2) \log \frac{m_1}{s p_1} = l_1 + \frac{l_1^2}{2 s p_1} + \frac{l_1}{2 s p_1} - \frac{l_1^3}{6 s^2 p_1^2} - \frac{l_1^2}{4 s^2 p_1^2} + \ldots$

Similarly developing the remaining factors in (13), the first
term vanishes on account of (16). The largest of the remaining
terms in (17) is

(18) $\frac{l_1^2}{2 s p_1}.$

One might be inclined to believe that

(19) $\frac{l_1}{2 s p_1}$

were of the same order of magnitude as (18), or even greater, and
indeed this is the case for very small values of $l_1$. If, however,
$l_1$ approximates $s p_1$, (18) is the most important term. The term
(19), together with the term in $l_1^3$, determines the 'skewness' of
the curve, and will be neglected here.

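Thus expression (13) becomes

(20) $B(m_1, m_2, m_3, m_4) = \frac{e^{-\frac{l_1^2}{2 s p_1} - \frac{l_2^2}{2 s p_2} - \frac{l_3^2}{2 s p_3} - \frac{l_4^2}{2 s p_4}}}{\sqrt{(2\pi)^3 s^3 p_1 p_2 p_3 p_4}}.$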
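
How closely the largest term (20) follows the exact formula (11) can be tried on a numerical case. The Python sketch below is illustrative only; the probabilities and counts are arbitrary values chosen for this example, and the function names are hypothetical.

```python
import math


def exact_B(m, p):
    """Formula (11), evaluated through logarithms of the factorials."""
    s = sum(m)
    log_b = math.lgamma(s + 1) - sum(math.lgamma(mi + 1) for mi in m)
    log_b += sum(mi * math.log(pi) for mi, pi in zip(m, p))
    return math.exp(log_b)


def approx_B(m, p):
    """Formula (20): the largest term of the expansion."""
    s = sum(m)
    l = [mi - s * pi for mi, pi in zip(m, p)]               # relations (15)
    exponent = -sum(li * li / (2 * s * pi) for li, pi in zip(l, p))
    return math.exp(exponent) / math.sqrt((2 * math.pi) ** 3 * s ** 3
                                          * math.prod(p))


p = (0.1, 0.2, 0.3, 0.4)            # arbitrary illustrative probabilities
at_centre = (40, 80, 120, 160)      # all l equal to zero, s = 400
off_centre = (45, 75, 125, 155)     # l = +5, -5, +5, -5

for m in (at_centre, off_centre):
    print(m, f"{exact_B(m, p):.3e}", f"{approx_B(m, p):.3e}")
```

The remaining difference away from the centre comes from the neglected terms (19) and the term in the cube of l, that is, from the skewness.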


Formula (20) is also applicable to correlation between
mutually exclusive attributes. First, however, we have to solve
the following correlation problem:

What is the probability of obtaining, in $s$ trials, $t_1$ individuals
with the attribute A and $t_2$ with the attribute B?

Obviously the $t_1$ individuals with the attribute A comprise
first the $m_1$ individuals of the group AB and second the $m_2$
individuals of the group Ab. Likewise the $t_2$ individuals with the
attribute B comprise the $m_1$ individuals of the group AB and the $m_3$
individuals of the group aB. Thus

$t_1 = m_1 + m_2, \qquad t_2 = m_1 + m_3.$


Moreover,

$s = m_1 + m_2 + m_3 + m_4,$

so that the problem leaves one of the numbers $m_1, m_2, m_3, m_4$
undetermined. Let $m_1$ be this number. We can then express $m_1, m_2,
m_3, m_4$ in terms of $t_1$, $t_2$, $s$, and $m_1$; this gives

(21)    $m_1 = m_1, \quad m_2 = t_1 - m_1, \quad m_3 = t_2 - m_1, \quad m_4 = s + m_1 - t_1 - t_2.$

It is convenient to put

(22)    $t_1 = s p_1 + s p_2 + \lambda_1, \qquad t_2 = s p_1 + s p_3 + \lambda_2,$

where $\lambda_1, \lambda_2$ are to be considered small numbers. Then we
obtain, instead of (21),

(23)    $l_1 = l_1, \quad l_2 = \lambda_1 - l_1, \quad l_3 = \lambda_2 - l_1, \quad l_4 = l_1 - \lambda_1 - \lambda_2.$


If, inserting these values in (20), and expressing the right side
of that equation in terms of $\lambda_1$, $\lambda_2$, and $l_1$, we ask for the
probability of certain values of $\lambda_1$ and $\lambda_2$, we must let $l_1$ run
through all possible values between $-\infty$ and $+\infty$.³ The probability
sought is then obtained by summing the terms corresponding to
these values. Let it be denoted by $B(\lambda_1, \lambda_2)$, so that

(24)    $B(\lambda_1, \lambda_2) = \int_{-\infty}^{+\infty} dl_1 \, B(m_1, m_2, m_3, m_4).$


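If we put

(25)    $B(m_1, m_2, m_3, m_4) = \frac{e^{-\Phi/2}}{\sqrt{(2\pi)^3 s^3 p_1 p_2 p_3 p_4}},$

then

$\Phi = (a_1 + a_2 + a_3 + a_4) l_1^2 - 2 l_1 [\lambda_1 (a_2 + a_4) + \lambda_2 (a_3 + a_4)] + (a_2 + a_4) \lambda_1^2 + (a_3 + a_4) \lambda_2^2 + 2 a_4 \lambda_1 \lambda_2,$

where

(25*)    $a_r = 1/(s p_r) \quad (r = 1, 2, 3, 4).$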


Applying the formula of integration 


(26) 


-f 00 




— 00 


we have 

3) This, of course, is not strictly true, but may be used in practice
as a good approximation. We must remember that $l_1, l_2, l_3, l_4$, and
therefore likewise $\lambda_1, \lambda_2$, are considered as small numbers in relation to
$s p_1, s p_2, s p_3, s p_4$, but that the squares of the $l$ and $\lambda$ are not so
considered.






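The quantities $\sigma_1$, $\sigma_2$, and $r$ are then given by the following
formulae:

(29)    $\frac{1}{1 - r^2} \cdot \frac{1}{\sigma_1^2} = \frac{(a_2 + a_4)(a_1 + a_3)}{a_1 + a_2 + a_3 + a_4},$

        $\frac{1}{1 - r^2} \cdot \frac{1}{\sigma_2^2} = \frac{(a_3 + a_4)(a_1 + a_2)}{a_1 + a_2 + a_3 + a_4},$

        $-\frac{r}{1 - r^2} \cdot \frac{1}{\sigma_1 \sigma_2} = \frac{a_1 a_4 - a_2 a_3}{a_1 + a_2 + a_3 + a_4}.$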
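One way to check these three relations is to carry out the integration over $l_1$ explicitly. The sketch below assumes the standard Gaussian integral and the usual form of the normal correlation function; the abbreviations $S$, $b$, $c$ are introduced here only for convenience and do not appear in the original.

```latex
% Abbreviations (not in the original): S, b, c, with a_r = 1/(s p_r).
S = a_1 + a_2 + a_3 + a_4, \qquad
b = \lambda_1 (a_2 + a_4) + \lambda_2 (a_3 + a_4), \qquad
c = (a_2 + a_4)\lambda_1^2 + (a_3 + a_4)\lambda_2^2 + 2 a_4 \lambda_1 \lambda_2 .

% Completing the square in l_1:
\int_{-\infty}^{+\infty} e^{-\frac{1}{2}\left(S l_1^2 - 2 b l_1 + c\right)} \, dl_1
  = \sqrt{\frac{2\pi}{S}} \; e^{-\frac{1}{2}\left(c - b^2/S\right)} ,

c - \frac{b^2}{S}
  = \frac{(a_2 + a_4)(a_1 + a_3)}{S}\,\lambda_1^2
  + \frac{(a_3 + a_4)(a_1 + a_2)}{S}\,\lambda_2^2
  + \frac{2\,(a_1 a_4 - a_2 a_3)}{S}\,\lambda_1 \lambda_2 .

% To be compared, term by term, with the normal correlation exponent
\frac{1}{1 - r^2}\left[\frac{\lambda_1^2}{\sigma_1^2}
  - \frac{2 r \lambda_1 \lambda_2}{\sigma_1 \sigma_2}
  + \frac{\lambda_2^2}{\sigma_2^2}\right] .
```

Matching the coefficients of $\lambda_1^2$, $\lambda_2^2$, and $\lambda_1 \lambda_2$ in the last two expressions gives the formulae for $\sigma_1$, $\sigma_2$, and $r$.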

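Therefore

(29*)    $-r \frac{\sigma_1}{\sigma_2} = \frac{a_1 a_4 - a_2 a_3}{(a_2 + a_4)(a_1 + a_3)}, \qquad -r \frac{\sigma_2}{\sigma_1} = \frac{a_1 a_4 - a_2 a_3}{(a_3 + a_4)(a_1 + a_2)},$

and by multiplication

(29**)    $r^2 = \frac{(a_1 a_4 - a_2 a_3)^2}{(a_2 + a_4)(a_3 + a_4)(a_1 + a_3)(a_1 + a_2)}.$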


The first two of these formulae shew that $r$ has the same sign
as the quantity $a_2 a_3 - a_1 a_4$. The coefficient of correlation may be
positive or negative and is always numerically smaller than unity,
as is seen immediately from (29**) or (29). Introducing the values
of the $a$ from (25*), the expression for $r$ takes the form

(30)    $r = \frac{p_1 p_4 - p_2 p_3}{\sqrt{(p_1 + p_2)(p_1 + p_3)(p_2 + p_4)(p_3 + p_4)}}.$

A rather lengthy algebraic manipulation yields the expression

(31)    $1 - r^2 = \frac{(a_1 + a_2 + a_3 + a_4) \, s \, a_1 a_2 a_3 a_4}{(a_1 + a_2)(a_1 + a_3)(a_2 + a_4)(a_3 + a_4)},$

so that, by (29),

$\sigma_1^2 = \frac{(a_1 + a_2)(a_3 + a_4)}{s \, a_1 a_2 a_3 a_4}, \qquad \sigma_2^2 = \frac{(a_1 + a_3)(a_2 + a_4)}{s \, a_1 a_2 a_3 a_4},$

or, introducing $p_1, p_2, p_3, p_4$,

(32)    $\sigma_1^2 = s (p_1 + p_2)(p_3 + p_4), \qquad \sigma_2^2 = s (p_1 + p_3)(p_2 + p_4).$

These expressions shew that $\sigma_1$ gives the dispersion of the
frequency curve obtained in answer to the problem: 'What is the
probability of obtaining in $s$ trials $t_1$ individuals with the
attribute A?'

The quantity $\sigma_2$ has analogous signification. These frequency
curves are obtained by integrating the expression (27) for
$B(\lambda_1, \lambda_2)$ over all values of $\lambda_2$ (or $\lambda_1$) from $-\infty$ to $+\infty$.
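The dispersions (32) and the coefficient (30) can be cross-checked against the exact second moments of the four-class distribution. The sketch below assumes the well-known multinomial moments, Var(m_r) = s p_r (1 - p_r) and Cov(m_r, m_k) = -s p_r p_k, which are not stated in the text; the function names and the numerical values are hypothetical.

```python
import math


def bernoulli_r(p1, p2, p3, p4):
    """Coefficient of correlation according to formula (30)."""
    num = p1 * p4 - p2 * p3
    den = math.sqrt((p1 + p2) * (p1 + p3) * (p2 + p4) * (p3 + p4))
    return num / den


def exact_moments(s, p):
    """Variances of t1 = m1 + m2 and t2 = m1 + m3 and their correlation,
    computed from the multinomial variances and covariances (an assumption
    of this sketch, not a formula quoted from the text)."""
    def cov(i, k):
        return s * p[i] * (1 - p[i]) if i == k else -s * p[i] * p[k]

    group_a, group_b = (0, 1), (0, 2)   # indices of (AB, Ab) and (AB, aB)
    var1 = sum(cov(i, k) for i in group_a for k in group_a)
    var2 = sum(cov(i, k) for i in group_b for k in group_b)
    cov12 = sum(cov(i, k) for i in group_a for k in group_b)
    return var1, var2, cov12 / math.sqrt(var1 * var2)


s = 483                      # hypothetical number of trials
p = (0.4, 0.3, 0.2, 0.1)     # hypothetical p1..p4 for AB, Ab, aB, ab
var1, var2, r_exact = exact_moments(s, p)
print(var1, s * (p[0] + p[1]) * (p[2] + p[3]))   # both are sigma_1 squared
print(r_exact, bernoulli_r(*p))                  # both are r
```

Under these assumptions the exact moments agree with (30) and (32) for every choice of the four probabilities, not only approximately.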

I call the expression (30) for $r$ the Bernoulli coefficient of
correlation and designate it by $r_B$, since it corresponds to that
correlation that exists between two attributes with Bernoullian
dispersion.

The schema for investigating this correlation may conveniently
be written in the following form (the notation was first introduced
by Yule [23]):


Table 39.

Schema of correlation in homograde statistics.
The attributes A and B may occur simultaneously.

    (AB)    (aB)    (B)
    (Ab)    (ab)    (b)
    (A)     (a)     N

Here (AB) denotes the number of individuals simultaneously
possessing the attributes A and B, (Ab) the number possessing the
attribute A but not the attribute B, (A) = (AB) + (Ab) the total
number possessing the attribute A, etc.

The expression (30) for $r_B$ can then be written in the following
form:

(33)    $r_B = \frac{(AB)(ab) - (Ab)(aB)}{\sqrt{(A)(B)(a)(b)}}.$

The Bernoulli coefficient of correlation vanishes when

    (AB):(Ab) = (aB):(ab),

i.e. when the ratio of the number of individuals with the attribute
B to the number without this attribute is the same regardless
of the presence of the attribute A. Obviously the attributes A and
B have then nothing to do with one another.

Furthermore the coefficient $r_B$ vanishes when

1. (AB) = (Ab) = 0, so that (A) = 0,

2. (AB) = (aB) = 0, so that (B) = 0,

3. (ab) = (Ab) = 0, so that (b) = 0,

4. (ab) = (aB) = 0, so that (a) = 0.

It is easy to see that in these cases no correlation can be 
proven to exist between the attributes A and B. 

Contrariwise the Bernoulli coefficient of correlation has the
value +1 if

    (Ab) = (aB) = 0,

so that all individuals possessing the attribute A have also the
attribute B.

It has the value -1 if

    (AB) = (ab) = 0,

so that all individuals possessing the attribute A lack the
attribute B, and vice versa.

In all other cases $r_B$ has a value greater than -1 and smaller
than +1.

If the value of $\lambda_2$ is known and the most probable value of
$\lambda_1$--which we shall call $\Lambda_1$--is sought, we have to differentiate
$B(\lambda_1, \lambda_2)$ with respect to $\lambda_1$ and set the differential equal to
zero. We then obtain

(34)    $\Lambda_1 = r \frac{\sigma_1}{\sigma_2} \lambda_2.$

If the value of $\lambda_1$ is given, the most probable corresponding
value of $\lambda_2$ is obtained from the formula

$\Lambda_2 = r \frac{\sigma_2}{\sigma_1} \lambda_1,$

so that the regression coefficients are obtained as in heterograde
statistics.

From (29*) are obtained the following expressions for the
regression coefficients:

(35)    $r \frac{\sigma_1}{\sigma_2} = \frac{(AB)(ab) - (Ab)(aB)}{(B)(b)}, \qquad r \frac{\sigma_2}{\sigma_1} = \frac{(AB)(ab) - (Ab)(aB)}{(A)(a)}.$
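The regression coefficients (35) have a simple reading in terms of the schema of Table 39: the first equals the difference between the proportion of A among the individuals with B and among those without B, and the second the corresponding difference for B. The sketch below illustrates this with hypothetical counts; the function name and the numbers are not from the text.

```python
import math


def regression_coefficients(AB, Ab, aB, ab):
    """Regression coefficients (35) from the four cell counts of Table 39."""
    A, a = AB + Ab, aB + ab      # totals with and without attribute A
    B, b = AB + aB, Ab + ab      # totals with and without attribute B
    cross = AB * ab - Ab * aB
    return cross / (B * b), cross / (A * a)


# Hypothetical counts, chosen only for illustration.
AB, Ab, aB, ab = 30, 10, 20, 40
reg_1_on_2, reg_2_on_1 = regression_coefficients(AB, Ab, aB, ab)

# Differences of proportions, computed directly from the schema.
diff_A = AB / (AB + aB) - Ab / (Ab + ab)   # share of A with B, minus share of A without B
diff_B = AB / (AB + Ab) - aB / (aB + ab)   # share of B with A, minus share of B without A

# The product of the two regression coefficients is r squared.
r_B = math.copysign(math.sqrt(reg_1_on_2 * reg_2_on_1), reg_1_on_2)
print(reg_1_on_2, diff_A, reg_2_on_1, diff_B, r_B)
```

With these counts the coefficient recovered from the product agrees with the value that formula (33) gives directly.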


97. As an application of these formulae I shall treat the following
case:

At the general hospital in Copenhagen, at one time, the
children afflicted with diphtheria were treated alternately with
the serum of Behring and by older methods. Professor Fibiger
reports [8, p. 317] that in 239 cases treated with serum 8
children died, whereas out of 244 children treated without
serum, 29 succumbed to the disease. How large is the Bernoulli
coefficient of correlation between serum and recovery?

Obviously we have here an example of homograde statistics,
because a gradation of the attribute death cannot occur, and
further, as far as was reported, the quantity of serum was the
same in all the 239 cases.


Table 40.

Correlation between serum and recovery.

                     Recovery    Death    Total
    With serum          231         8      239
    Without serum       215        29      244
    Total               446        37      483

From formula (30) we have

$r_B = \frac{231 \times 29 - 215 \times 8}{\sqrt{239 \times 446 \times 37 \times 244}}.$


The coefficient of correlation between recovery and the serum
of Behring is thus +0.160. Perhaps a stronger correlation would
have been expected from the figures. In this connexion it should be
noted that even without serum a significant percentage of the cases
of illness recovered. If it were a case of an illness from which,
say, 100 per cent. deaths occurred without serum, but only p per
cent. with serum, then obviously we would have

$r_B = \sqrt{\frac{100 - p}{100 + p}}$

and for small values of p

$r_B = 1 - \frac{p}{100}.$
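The figures of Table 40 and the hypothetical case of a wholly fatal illness can both be recomputed from the four cell counts. The sketch below is illustrative only: the function names are hypothetical, and the second function assumes that the two groups of patients are of equal size, which the text does not state explicitly.

```python
import math


def bernoulli_r_counts(AB, Ab, aB, ab):
    """Bernoulli coefficient of correlation from the cell counts of the schema."""
    A, a = AB + Ab, aB + ab
    B, b = AB + aB, Ab + ab
    return (AB * ab - Ab * aB) / math.sqrt(A * a * B * b)


# Table 40: A = treated with serum, B = recovery.
r_fibiger = bernoulli_r_counts(AB=231, Ab=8, aB=215, ab=29)


def r_fatal_without_serum(p, n=1000):
    """Illness with 100 per cent. deaths without serum and p per cent. with serum.
    Equal group sizes n are an assumption of this sketch."""
    recovered_with = n * (100 - p) / 100
    died_with = n * p / 100
    return bernoulli_r_counts(AB=recovered_with, Ab=died_with, aB=0, ab=n)


print(r_fibiger)                       # about 0.16
print(r_fatal_without_serum(4), math.sqrt(96 / 104), 1 - 4 / 100)
```

For p = 4 the exact value and the approximation 1 - p/100 differ only in the third decimal place.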


With respect to the mean error I refer to the papers of Yule
[24] and of Wicksell [22].

Yule finds 

se(r)-l r+(r^^r) 

(A) {a) + (B)(fr)' /• 


98. The attributes that we have treated up to now may occur
simultaneously in one individual. Let us now consider the case
where the attributes between which the correlation is sought are
all homograde but cannot occur simultaneously in one individual.
Such attributes are called alternative attributes.

If there are only two alternative attributes--A and B--the
solution is simple. If we take N individuals and find that $m_1$ of
these have the attribute A, we know then that $N - m_1$ individuals
have the attribute B. The coefficient of correlation here has the
value -1. If, however, there are three attributes--A, B, and C--
the solution is different. If, of N individuals, $m_1$ possess the
attribute A, then of the remaining $N - m_1$ individuals a part have
the attribute B and a part the attribute C. The problem will be to
calculate the probable number of individuals with the attribute B
or C. The same problem occurs with four or more attributes.

Let us take the case of four alternative attributes
A, B, C, D.

Let $s$, the total number of individuals, be assumed constant.
Designating by $M_1, M_2, M_3, M_4$ the mean number of individuals
with the attributes A, B, C, D, respectively, so that

(36)    $M_1 + M_2 + M_3 + M_4 = s,$

we define

(36*)    $p_1 = M_1 : s, \quad p_2 = M_2 : s, \quad p_3 = M_3 : s, \quad p_4 = M_4 : s,$

thus considering $p_1, p_2, p_3, p_4$ as the respective probabilities of
obtaining an individual of the group A, B, C, D, if an individual
be chosen from the population at random. As an obvious consequence
of (36),

(37)    $p_1 + p_2 + p_3 + p_4 = 1.$

If we now take an arbitrary sample of $s$ individuals from the
population and obtain $m_1, m_2, m_3, m_4$ individuals with the
attributes A, B, C, D, we have

$m_1 + m_2 + m_3 + m_4 = s.$

If now, as in the previous problem, we set

$m_r = s p_r + l_r \quad (r = 1, 2, 3, 4),$

so that from (37)

(38)    $l_1 + l_2 + l_3 + l_4 = 0,$

we may ask for the correlation between two of these numbers $l$, say
between $l_1$ and $l_2$.

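We obtain for the probability of the simultaneous deviations
$l_1, l_2, l_3, l_4$, as in the previous problem,

(39)    $B(m_1, m_2, m_3, m_4) = \frac{e^{-\frac{l_1^2}{2 s p_1} - \frac{l_2^2}{2 s p_2} - \frac{l_3^2}{2 s p_3} - \frac{l_4^2}{2 s p_4}}}{\sqrt{(2\pi)^3 s^3 p_1 p_2 p_3 p_4}}.$

Eliminating $l_4$ by means of (38), and integrating (39) from $-\infty$
to $+\infty$ with respect to $l_3$, we obtain the probability of simultaneous
occurrence of the deviations $l_1$ and $l_2$.

This probability--$B(l_1, l_2)$--is

$B(l_1, l_2) = \frac{1}{\sqrt{(2\pi)^2 s^2 p_1 p_2 (p_3 + p_4)}} \exp\left\{-\frac{1}{2 (1 - r_{12}^2)} \left[\frac{l_1^2}{\sigma_1^2} - \frac{2 r_{12} l_1 l_2}{\sigma_1 \sigma_2} + \frac{l_2^2}{\sigma_2^2}\right]\right\},$

where

$\frac{1}{1 - r_{12}^2} \cdot \frac{1}{\sigma_1^2} = \frac{q_2}{s p_1 (p_3 + p_4)}, \qquad \frac{1}{1 - r_{12}^2} \cdot \frac{1}{\sigma_2^2} = \frac{q_1}{s p_2 (p_3 + p_4)}, \qquad -\frac{r_{12}}{1 - r_{12}^2} \cdot \frac{1}{\sigma_1 \sigma_2} = \frac{1}{s (p_3 + p_4)} \qquad (q_i = 1 - p_i),$

whence follows

$r_{12} \frac{\sigma_1}{\sigma_2} = -\frac{p_1}{q_2}, \qquad r_{12} \frac{\sigma_2}{\sigma_1} = -\frac{p_2}{q_1}, \qquad r_{12}^2 = \frac{p_1 p_2}{q_1 q_2}.$

The first two of these equations shew that $r_{12}$ is negative,
so that

(40)    $r_{12} = -\sqrt{\frac{p_1 p_2}{q_1 q_2}},$

an equation first given by Pearson [17, no. 92].
So we have

$\sigma_1 = \sqrt{s p_1 q_1}, \qquad \sigma_2 = \sqrt{s p_2 q_2}, \qquad \sigma_1 \sigma_2 r_{12} = -s p_1 p_2.$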


This treatment can obviously be extended to an arbitrary
number of attributes in a population.

99. As a simple example of formula (40) we may take the following
problem:

How large is the correlation between the number of spades and
clubs at whist?

We here have two attributes that cannot occur simultaneously
in one individual. We have

$p_1 = p_2 = 1/4, \qquad q_1 = q_2 = 3/4,$

so that by formula (40)

$r_{12} = -1/3.$

To illuminate the concept of correlation, the significance of
this result should be more completely explained.

The total number of cards for each player is 13. The mean
number of cards of each suit is 3.25. If a player has more than 4
spades, then the average number of cards of the remaining suits
must be less than 3. If, say, he receives 7 spades, then on the
average he must have 2 clubs, 2 diamonds, and 2 hearts. To an
excess of 3.75 spades (over the theoretic mean of 3.25 spades)
corresponds an average deficiency of 1.25 cards in each of the
other three suits. The excess in one suit is uniformly divided
among the other suits. This connexion between the number of cards
in different suits is technically expressed by the statement that
the coefficient of correlation between the number of cards in two
different suits at whist is negative and has the value -1/3.
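The whist figures follow directly from formula (40) and from the regression coefficient $r_{12} \sigma_2 / \sigma_1 = -p_2 / q_1$. The sketch below recomputes them; the function names are hypothetical.

```python
import math


def alternative_r(p1, p2):
    """Formula (40): r_12 = -sqrt(p1 p2 / (q1 q2)), with q = 1 - p."""
    return -math.sqrt(p1 * p2 / ((1 - p1) * (1 - p2)))


def mean_deviation_of_second(l1, p1, p2):
    """Average deviation of the second attribute for a given deviation l1
    of the first, using the regression coefficient -p2/q1."""
    return -p2 / (1 - p1) * l1


p_spades = p_clubs = 1 / 4
r_whist = alternative_r(p_spades, p_clubs)

mean_per_suit = 13 * p_spades            # 3.25 cards
excess_spades = 7 - mean_per_suit        # 3.75 cards
deficit_clubs = mean_deviation_of_second(excess_spades, p_spades, p_clubs)
print(r_whist, deficit_clubs, mean_per_suit + deficit_clubs)
```

The last printed value is the average number of clubs held together with 7 spades, namely 2, in agreement with the text.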
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Table 41. 


      u           x          u
 (-x = R(u))            (x = R(u))

     .01       2.3263       .99
     .02       2.0537       .98
     .03       1.8808       .97
     .04       1.7507       .96
     .05       1.6449       .95
     .06       1.5548       .94
     .07       1.4758       .93
     .08       1.4051       .92
     .09       1.3408       .91
     .10       1.2816       .90
     .11       1.2265       .89
     .12       1.1750       .88
     .13       1.1264       .87
     .14       1.0803       .86
     .15       1.0364       .85
     .16       0.9945       .84
     .17       0.9542       .83
     .18       0.9154       .82
     .19       0.8779       .81
     .20       0.8416       .80
     .21       0.8064       .79
     .22       0.7722       .78
     .23       0.7388       .77
     .24       0.7063       .76
     .25       0.6745       .75
     .26       0.6433       .74
     .27       0.6128       .73
     .28       0.5828       .72
     .29       0.5534       .71
     .30       0.5244       .70
     .31       0.4959       .69
     .32       0.4677       .68
     .33       0.4399       .67
     .34       0.4125       .66
     .35       0.3853       .65
     .36       0.3585       .64
     .37       0.3319       .63
     .38       0.3055       .62
     .39       0.2793       .61
     .40       0.2533       .60
     .41       0.2275       .59
     .42       0.2019       .58
     .43       0.1764       .57
     .44       0.1510       .56
     .45       0.1257       .55
     .46       0.1004       .54
     .47       0.0753       .53
     .48       0.0502       .52
     .49       0.0251       .51
     .50       0.0000       .50
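The entries of Table 41 can be recomputed with a modern library. The sketch below assumes that R(u) is the inverse of the normal probability integral taken from $-\infty$, which is consistent with the tabulated values; it uses `NormalDist.inv_cdf` from the Python standard library, and the function name `R` merely mirrors the table heading.

```python
from statistics import NormalDist

_standard_normal = NormalDist()  # mean 0, dispersion 1


def R(u):
    """Normal coordinate x below which the probability u lies
    (assumed meaning of the R-function of Table 41)."""
    return _standard_normal.inv_cdf(u)


# Left-hand column of the table: -x = R(u); right-hand column: x = R(u).
for u in (0.01, 0.05, 0.25):
    print(f"{u:.2f}  {-R(u):.4f}  {1 - u:.2f}  {R(1 - u):.4f}")
```

Each printed line gives the same x twice, once read from the left-hand column and once from the right-hand column.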



Table 42. 


  x                   φ_0                              φ_4

 0.00   .5000000   .3989423   .3989423   .0000000   1.1968   .0000000   5.9841
  .01   .5039894   .3989223   .3988824   .0119673   1.1965   .0598344   5.9820
  .02   .5079783   .3988625   .3987030   .0239286   1.1956   .1196268   5.9758
  .03   .5119665   .3987628   .3984037   .0358779   1.1941   .1793356   5.9653
  .04   .5159534   .3986233   .3979855   .0478093   1.1920   .2389189   5.9507
  .05   .5199388   .3984439   .3974478   .0597168   1.1894   .2983350   5.9319
  .06   .5239222   .3982248   .3967912   .0715945   1.1861   .3575425   5.9089
  .07   .5279032   .3979661   .3960160
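The surviving rows of Table 42 can be recomputed under an assumption about what the columns contain. The sketch below supposes, from the numerical values alone, that the columns hold the probability integral, the probability function φ_0, and the absolute values of its second to sixth derivatives; this identification of the columns, the function name, and the ordering are assumptions, not statements of the source.

```python
from statistics import NormalDist

_standard_normal = NormalDist()


def table_row(x):
    """Recompute one row under the assumed column meanings:
    integral, phi_0, then |phi_2| ... |phi_6| (phi_k = k-th derivative of phi_0)."""
    phi0 = _standard_normal.pdf(x)
    return (
        _standard_normal.cdf(x),
        phi0,
        (1 - x**2) * phi0,                               # |phi_2| for small x
        (3 * x - x**3) * phi0,                           # |phi_3|
        (x**4 - 6 * x**2 + 3) * phi0,                    # phi_4
        (x**5 - 10 * x**3 + 15 * x) * phi0,              # |phi_5|
        (-x**6 + 15 * x**4 - 45 * x**2 + 15) * phi0,     # |phi_6|
    )


row = table_row(0.05)
print(" ".join(f"{value:.7f}" for value in row))
```

Comparing such recomputed rows with the printed figures is a convenient way of detecting misread digits.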
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Table 44. 


« = |s|< 

.00    .000
.03    .180
.06    .251
.09    .302
.12    .353
.15    .406
.18    .445
.21    .478
.24    .496
.27    .518
.30    .524
.33    .532
.36    .509
.39    .483
.42    .441
.45    .374
.48    .253
.50    .000
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Index. The numbers refer to articles 


a = absence of attribute A 96
a = subclass 95
A = attribute 95, 96, 98
a_r = 1/(s p_r) 96

^1* ^2 ~ partial numbers of 

individuals 92 
Abweichung, mittlere 9
Alternative attributes 98
Array 86

Arrays, dispersion of 86
means of 86 
Asymmetry 4 
Attribute 1 
absence of 96 
degree of 1 

b = absence of attribute B 96 
b = Af • Af^ S 
b - subclass 95 
B - attribute 95 , 96 , 98 
B = joint probability 96 , 98 
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BERNOULLI 
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25 , 
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53 , 61 . 96 * 97 
JS “ average secular change 39 
^3* ^ 4 $ /^5 ~ characteristics of Type 
A frequency curve 62 
Bortkiewicz 29 , 70 

c subclass 95 
C = attribute 98 
Change, secular, average 39 
Characteristics 4 

higher, of correlation function 
88 

of frequency curve 63
Charlier, Essie 85
Sonja 85

Class interval 6 , 57 

in correlation series 85 
Class mark 56 

in correlation series 85 
Classes 6 , 8 , 62 
Coincident regression lines 87 
Comparison, number of 1, 42, 52
Computing machine 58
Computation, check on 8 


of correlation, check on 78 
Correlation 76 

coefficient of 76 , 77 
Bernoulli 96 
mean error of 82 
tetrachoric 95 
Correlation table 85 , 86 
Correspondence, good 19 
satisfactory 19 

d - subclass 95 
D - attribute 98 
Degree of connexion 75 
A = finite difference 72 
Deviation, average 11 , 46 
mean error of 16 
standard 9 
Difference, finite 72 
mean error of 17 
Directly observed series 45 
Dispersion 4 , 7 , 9 , 57 

approximately computed from 
average deviation 11 
Bernoulli 21 

of reduced series 42 
Lexis 28 
mean error of 15 
Poisson 26 

Disturbancy, coefficient of 37

E = excess 4
Element 1

mean error of 13
Elements, number of 56
Elementary errors 54
hypothesis of 53
6 = mean error 12 
Error, mean 12 

of skewness and excess 68 
sources of 54 
Evolutory changes 38 
Excess* 4 , 64 

mean error of 66 


F = frequency 6 

° factor of reduction 42 
/2 factor of reduction 46 
Fibiger 97
Frequency 6 
Frequency curve 56 
Frequency equation 56 
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fJrst 67 , 94 
Galton 75 

72 = eccentricity 72 
'^2* "^3* ^4 “ cheracteriitici of Type 
B frequency curve 72 

Gauss 9 , 53 . 56 , 57 , 69 
Graphic method in studying covariation 75

h = normal coordinate 95 
H = 95 

Hagen 53 
HANSEN 94 

Heterograde individuals 3, 53
Heterograde statistical series 60 
Homograde individuals 2 , 53 

JOHANNSEN 65 

k = normal coordinate 95 
K = 4>o(k) 95 

Kurtosis, see excess 

1 ^ ^ • up 96 
L = LEXIS ratio 29 
Laplace 31 , 55 

= characteristic of Type B 
frequency curve 72 

= *1 - «Pl - Spj 96 

LEXIS 28 , 29 , 30 , 31 . 32 , 33 , 
34 , 35 , 36 , 38 , 43 , 44 9 
51 . 53 

List, primary 1 , 3 

SI == element 1 
M “ arithmetic mean 4 
Mg - BERNOULLI mean 21 
= LEXIS mean 28 
Mjp = POISSON mean 26 
Mq = provisional mean 5 
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= mean 

of simple 

r e du c 
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M 2 - mean of reduced and weighted 
series 46 
Mean, arithmetic 4 


BERNOULLI 21 
harmonic 42 
LEXIS 28 
mean error of 14 
POISSON 26 
provisional 5 
Medium 4 
Moivre 96 

Most probable value 64 
in correlation series 80 

n = element 77 
N - number of elements 4 
Non-normal frequency curve 69 
Normal coordinates 58 , 62 , 95 
Normal curve 56 , 57 
Normal dispersion 29 
Normal distribution 57 
Normal frequency curve, statistical 
object characterised by 69 

Observed frequency curve 62 
Oscillatory changes 38 

p = probability 96 
P - population 1 
Pq = mean probability 26 
Pearson 9, 69, 95, 98
Periodic changes 38
♦= exponent in i?Ch* I 2 ) 

exponent in fJCsij, mj* «3t « 4 ) 
96 

4^ ~ probability function 58 
exponent in V 2 ) 96 

^3» ^4, ^5 " derivates of prob¬ 
ability function 62 
Poisson 26 , 27 , 28 , 29 , 51 
53 , 70 
Population 1 
Probability, mean 26 
Probability function 58 
Probability integral 91 

= Type 6 generating function 72 

Qq - mean probability 26 
Q = probability integral 91 
Quetelet 69

r * coefficient of correlation 77 
R = inverse of Q-function 91 
rg = BERNOULLI coefficient of 
correlation 96 

**12 “ corr^llotion between A 

and B 98 





Ratio, Lexis 29

mean error of 34
Reduction, factor of 42
Regression 80

coefficient of 80
Regression lines 80, 86
non-linear 89
Retzius 67

ρ = coefficient of disturbancy 37

s = number of comparison 1, 52
S = skewness 4

S_1, S_2 = simultaneously observed
statistical series 77
Secular changes 38
Series, correlation, dispersion
of 77

means of 77, 98
provisional means of 77
reduced, simple, dispersion of 46
mean of 46

and weighted, dispersion of 46
mean of 46

statistical, reduced 42
and weighted 46
σ = dispersion 4, 7
σ_B = Bernoulli dispersion 21
σ_L = LEXIS dispersion 28
σ_P = Poisson dispersion 26

σ_1 = dispersion of simple reduced
series 46

σ_1, σ_2 = dispersions of correlation
series 77

σ_2 = dispersion of reduced and
weighted series 46
Significant figures 18
Simple reduced series 50
Simultaneously observed statistical
series 51, 75
Skewness 4, 64, 69
mean error of 66
Statistics, alternative 2
heterograde 3
homograde 2
qualitative 3
STIRLING 96

Streuung 9


Subclass 85, 95
Subnormal dispersion 29
Sum, mean error of 17
Supernormal dispersion 29
SUSSMILCH 50
Symptomatic changes 38

t_1 = m_1 + m_2
t_2 = m_1 + m_3 96

Theoretic frequency curve 62 . 66 
0 - average deviation 11 . 46 
Two normal curves, decomposition 
into 69 
Type 69 
Type A 56 

frequency curve of. character* 
iatics of 63 
general equation for 62 
Type B 56 

frequency curve of 70 
characteristic of 72 

Variance 9 

w == class interval 6 

*2 ~ class intervals in 
correlation series 85 
WICKSELL 97 

X ~ class mark 56 

X ® class mark in correlation 
series 85 

X ~ intensity of attribute 3 
X = moat probable value of x 
for given y 80 
X ~ normal coordinate 58 
X*array 86 

y = class mark in correlation 
aeries 85 

y * number of elements 56 
y* moat probable value of y 
for given x 80 
Y * normal coordinate 58 
y-array 86 
YULE 96 . 97 
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